Kafka Deep Dive

A Kafka topic is a logical concept. Physically, it's split into partitions -- and partitions are where all the action happens.

Topic partitions

A topic is spread across multiple partitions, each of which can live on a different broker. This is how Kafka achieves parallelism and scalability.

Think first
If a topic is just one big log on one broker, what limits its throughput and storage? How would you break it up so that multiple brokers can share the load and multiple consumers can read in parallel?

Key properties of partitions:

| Property | Detail |
| --- | --- |
| Ordering | Messages are strictly ordered within a partition (by offset), but not across partitions |
| Offset | Each message gets a unique, monotonically increasing sequence number within its partition |
| Immutability | Once written, messages cannot be modified |
| Addressing | A message is uniquely identified by (topic, partition, offset) |
| Parallelism | Each partition can be consumed independently -- more partitions = more consumer parallelism |
| Storage | A topic can hold more data than a single broker can store, because its partitions are spread across brokers. Each individual partition must still fit on the brokers that host it |
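
To make the ordering, offset, and addressing properties concrete, here is a minimal in-memory sketch. `ToyTopic` and its methods are hypothetical names for illustration, not part of any Kafka client; a real partition is a replicated on-disk log.

```python
class ToyTopic:
    """Toy topic: a fixed number of append-only partition logs held in memory."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, partition, value):
        """Append to one partition and return the message's address."""
        log = self.partitions[partition]
        log.append(value)
        offset = len(log) - 1  # offsets start at 0 and grow by 1 per message
        return (self.name, partition, offset)

    def read(self, partition, offset):
        """Look up a message by its partition and offset."""
        return self.partitions[partition][offset]


topic = ToyTopic("orders", num_partitions=3)
a = topic.append(0, "order-1")  # ("orders", 0, 0)
b = topic.append(1, "order-2")  # ("orders", 1, 0)
c = topic.append(0, "order-3")  # ("orders", 0, 1)
```

Offsets restart at 0 in every partition, which is why the partition number has to be part of a message's address and why ordering is only meaningful within one partition.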

How producers choose a partition

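  • Key-based: If a message has a key, the producer hashes it to determine the partition. Messages with the same key go to the same partition (guaranteeing ordering for that key) as long as the number of partitions does not change.
  • Round-robin: Without a key, messages are spread across partitions so that load evens out over time. Older clients rotate on every message; since Kafka 2.4 the default partitioner fills a batch for one partition before moving to the next ("sticky" partitioning).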
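
The two strategies can be sketched as follows. This is an illustration under assumptions: `ToyPartitioner` is a hypothetical class, `zlib.crc32` stands in for the murmur2 hash that Kafka's Java client uses, and the keyless path is a plain per-message rotation rather than sticky batching.

```python
import itertools
import zlib


class ToyPartitioner:
    """Toy partition chooser: hash the key if present, else rotate."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._counter = itertools.count()

    def partition_for(self, key=None):
        if key is not None:
            # Stand-in hash (assumption); Kafka's Java client uses murmur2.
            return zlib.crc32(key.encode("utf-8")) % self.num_partitions
        # No key: simple per-message round-robin.
        return next(self._counter) % self.num_partitions


partitioner = ToyPartitioner(num_partitions=3)
same_key = [partitioner.partition_for("user-42") for _ in range(3)]
keyless = [partitioner.partition_for() for _ in range(6)]
```

Every entry in `same_key` is the same partition number, while `keyless` cycles through all three partitions. Because the result depends on `num_partitions`, adding partitions later can move a key to a different partition.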

Dumb broker, smart consumer

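The broker doesn't push messages or decide what each consumer should read next. Instead, consumers track their own position -- they tell Kafka which offset to start reading from. (Committed offsets are stored in Kafka, in the internal __consumer_offsets topic, but it is the consumer that decides what to commit and where to resume.) This enables: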

  • Replay: Reset the offset to re-process old messages
  • Late joining: New consumers can start from the beginning or from the latest offset
  • Independent progress: Each consumer advances at its own pace
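
A minimal sketch of consumer-side offset tracking, using a plain Python list as a single partition's log. `ToyConsumer` is a hypothetical class; its `start` and `seek` are only loosely modelled on the real client's `auto.offset.reset` setting and `seek` call.

```python
class ToyConsumer:
    """Toy consumer that owns its read position in a shared log."""

    def __init__(self, log, start="earliest"):
        self.log = log
        # "earliest" starts at offset 0; "latest" skips everything already written.
        self.position = 0 if start == "earliest" else len(log)

    def poll(self, max_records=10):
        """Return the next records and advance this consumer's position."""
        records = self.log[self.position:self.position + max_records]
        self.position += len(records)
        return records

    def seek(self, offset):
        """Move the read position, for example to replay old messages."""
        self.position = offset


log = ["m0", "m1", "m2"]
fast = ToyConsumer(log, start="earliest")
late = ToyConsumer(log, start="latest")

first_read = fast.poll()  # reads m0, m1, m2
log.append("m3")
late_read = late.poll()   # only sees m3, written after it joined
fast.seek(1)              # replay from offset 1
replayed = fast.poll()    # reads m1, m2, m3
```

The log itself holds no per-consumer state, so the two consumers cannot interfere with each other's progress.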

Replication: leaders and followers

Every partition has one leader and zero or more followers -- a direct application of the Leader and Follower pattern.

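  • Leader: Handles all writes for the partition and, by default, all reads. Since Kafka 2.4, consumers can optionally be configured to fetch from a follower instead.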
  • Followers: Replicate the leader's log as backup. Can become leader if the current leader fails.

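Cluster metadata, kept in ZooKeeper in older deployments or in the KRaft controller quorum in newer ones, records which broker leads each partition. Producers and consumers fetch this metadata from the brokers and then talk directly to partition leaders.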

In-sync replicas (ISR)

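An in-sync replica (ISR) is a replica that has caught up with the leader's log; the leader itself always counts as in sync. "Caught up" is time-bounded: a follower that has not caught up within replica.lag.time.max.ms is dropped from the ISR set. By default, only ISRs are eligible to become the new leader during failover. The ISR set is dynamic -- a slow follower drops out, and rejoins when it catches up.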
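
The dynamic ISR set and ISR-only failover can be sketched as below. All names are hypothetical, the timestamps are plain numbers in seconds, and `elect_leader` uses an arbitrary rule (lowest broker id) purely to keep the example deterministic. Kafka's real lag check is time-based, controlled by the broker setting `replica.lag.time.max.ms`.

```python
def in_sync_replicas(leader_id, last_caught_up, now, max_lag_seconds):
    """Return the ids of replicas currently considered in sync.

    last_caught_up maps follower id -> time it was last fully caught up.
    The leader is always a member of its own ISR set.
    """
    isr = {leader_id}
    for replica_id, caught_up_at in last_caught_up.items():
        if now - caught_up_at <= max_lag_seconds:
            isr.add(replica_id)
    return isr


def elect_leader(isr, failed_leader):
    """Pick a new leader from the surviving ISR members (arbitrary rule: lowest id)."""
    candidates = isr - {failed_leader}
    if not candidates:
        raise RuntimeError("no in-sync replica available to take over")
    return min(candidates)


# Follower 3 last caught up 40 seconds ago, so it falls out of the ISR set.
isr_before = in_sync_replicas(1, {2: 95.0, 3: 60.0}, now=100.0, max_lag_seconds=30.0)
# Once follower 3 catches up again, it rejoins.
isr_after = in_sync_replicas(1, {2: 99.0, 3: 98.0}, now=100.0, max_lag_seconds=30.0)
new_leader = elect_leader(isr_before, failed_leader=1)
```

With follower 3 out of the set, only follower 2 is a valid successor if leader 1 fails.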

Think first
Imagine a consumer reads the very latest message from the leader, but the leader crashes before that message is replicated to any follower. The new leader does not have that message. What consistency problem does this create, and how might you prevent it?

High-water mark

How does Kafka prevent consumers from reading data that might disappear if the leader crashes? With the High-Water Mark pattern.

The high-water mark is the highest offset replicated to all ISRs. Consumers can only read up to this point. Everything above it is "uncommitted" -- it exists on the leader but hasn't been confirmed by all ISRs.

Why this matters: If a consumer reads offset 7 from the leader, then the leader crashes before offset 7 is replicated, the new leader won't have that message. The consumer experiences a non-repeatable read. The high-water mark prevents this entirely.
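
The offset-7 scenario can be worked through in a few lines. This sketch follows the definition above (the highest offset present on every in-sync replica); the function names and the `last_offsets` map are hypothetical, and real brokers track this bookkeeping differently.

```python
def high_water_mark(last_offsets, isr):
    """Highest offset that every in-sync replica already holds."""
    return min(last_offsets[replica_id] for replica_id in isr)


def visible_to_consumers(leader_log, hwm):
    """Consumers may read offsets 0..hwm inclusive, nothing above."""
    return leader_log[:hwm + 1]


leader_log = [f"m{i}" for i in range(8)]  # leader holds offsets 0..7
last_offsets = {1: 7, 2: 6, 3: 6}         # replica id -> highest offset held
isr = {1, 2, 3}                           # replica 1 is the leader

hwm = high_water_mark(last_offsets, isr)        # 6
visible = visible_to_consumers(leader_log, hwm)  # m0 .. m6
```

Offset 7 exists on the leader but stays hidden until the followers fetch it, so a leader crash at this moment cannot take away anything a consumer has already seen.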

Interview angle

Kafka's partitioning model is the answer to "How do you scale a messaging system?" More partitions = more parallelism. Replication with ISR = fault tolerance. High-water mark = consistency guarantee. Walk through all three layers to show you understand the complete picture.

Quiz
A topic has 3 partitions, each with a replication factor of 3 (1 leader + 2 followers). One follower for Partition 0 falls behind and is removed from the ISR. What happens to the high-water mark for Partition 0?