Kafka Deep Dive

A Kafka topic is a logical concept. Physically, it's split into partitions -- and partitions are where all the action happens.

Topic partitions

A topic is spread across multiple partitions, each of which can live on a different broker. This is how Kafka achieves parallelism and scalability.

Think first

If a topic is just one big log on one broker, what limits its throughput and storage? How would you break it up so that multiple brokers can share the load and multiple consumers can read in parallel?

Key properties of partitions:

Property	Detail
Ordering	Messages are strictly ordered within a partition (by offset), but not across partitions
Offset	Each message gets a unique, monotonically increasing sequence number within its partition
Immutability	Once written, messages cannot be modified
Addressing	A message is uniquely identified by (topic, partition, offset)
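Parallelism	Each partition can be consumed independently, so more partitions allow more consumers in a group to read in parallel (at most one consumer per partition within a group)
Storage	A topic can hold more data than a single broker can store, because its partitions are spread across brokers; each individual partition still has to fit on the broker that hosts it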
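
The properties above can be sketched with a toy in-memory model. This is only an illustration, not how Kafka stores data: the `Partition` and `Topic` classes are hypothetical names, and a Python list stands in for the on-disk log so that the list index plays the role of the offset.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """An append-only log; the list index doubles as the offset."""
    records: list = field(default_factory=list)

    def append(self, value):
        self.records.append(value)
        return len(self.records) - 1  # offset assigned to the new record

    def read(self, offset):
        return self.records[offset]


@dataclass
class Topic:
    name: str
    partitions: list

    @classmethod
    def create(cls, name, num_partitions):
        return cls(name, [Partition() for _ in range(num_partitions)])


orders = Topic.create("orders", num_partitions=3)
first = orders.partitions[0].append("order-1 created")
second = orders.partitions[0].append("order-1 paid")
other = orders.partitions[2].append("order-7 created")

# Offsets are per partition, so partition 2 starts again at 0.
# A record is addressed by (topic, partition, offset).
address = (orders.name, 0, second)
```

Within partition 0 the two records keep their write order, but nothing here orders them relative to the record in partition 2.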

How producers choose a partition

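Key-based: If a message has a key, the producer hashes it and maps the hash onto one of the topic's partitions. Messages with the same key always go to the same partition (guaranteeing ordering for that key), as long as the number of partitions does not change.
Round-robin: Without a key, the producer spreads messages across partitions so that load stays roughly even. Older clients rotate partition by partition for each message; since Kafka 2.4 the default partitioner sticks to one partition until a batch is full, then moves on.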
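
A minimal sketch of the two strategies, under stated assumptions: `SketchPartitioner` is a hypothetical class, `zlib.crc32` stands in for the hash function Kafka's clients actually use, and the keyless path is plain per-message round-robin.

```python
import itertools
import zlib


class SketchPartitioner:
    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._next = itertools.cycle(range(num_partitions))

    def partition_for(self, key=None):
        if key is None:
            # No key: rotate through the partitions in turn.
            return next(self._next)
        # Key present: same key -> same hash -> same partition.
        return zlib.crc32(key) % self.num_partitions


partitioner = SketchPartitioner(num_partitions=4)
keyed = [partitioner.partition_for(b"user-42") for _ in range(3)]
unkeyed = [partitioner.partition_for() for _ in range(8)]
```

Because the partition is the hash modulo the partition count, adding partitions later can move an existing key to a different partition.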

Dumb broker, smart consumer

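The broker doesn't push messages or decide what each consumer reads next. Instead, consumers track their own offsets -- they tell Kafka which offset to start reading from. (A consumer group can commit its offsets back to Kafka for safekeeping, but the consumer still drives its own position.) This enables: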

Replay: Reset the offset to re-process old messages
Late joining: New consumers can start from the beginning or from the latest offset
Independent progress: Each consumer advances at its own pace
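
The three behaviours above follow from the consumer owning its position. A rough sketch, assuming a single partition modelled as a Python list; `SketchConsumer` and its `poll`/`seek` methods are illustrative names, not the Kafka client API.

```python
class SketchConsumer:
    def __init__(self, log, start="earliest"):
        self.log = log
        # "earliest" starts at offset 0, "latest" skips what already exists.
        self.position = 0 if start == "earliest" else len(log)

    def poll(self, max_records=10):
        batch = self.log[self.position:self.position + max_records]
        self.position += len(batch)  # the consumer advances its own offset
        return batch

    def seek(self, offset):
        self.position = offset  # replay is just moving the position back


log = ["m0", "m1", "m2", "m3"]
fast = SketchConsumer(log, start="earliest")
late = SketchConsumer(log, start="latest")

fast.poll(max_records=3)   # reads m0..m2
log.append("m4")           # a producer writes one more record
late_batch = late.poll()   # the late joiner sees only the new record
fast.seek(0)               # reset to re-process old messages
replayed = fast.poll(max_records=2)
```

The log itself holds no per-consumer state, which is why any number of consumers can read it at different speeds.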

Replication: leaders and followers

Every partition has one leader and zero or more followers -- a direct application of the Leader and Follower pattern.

Leader: Handles all writes and, by default, all reads for the partition
Followers: Replicate the leader's log as backup. Can become leader if the current leader fails.

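The cluster metadata, kept in ZooKeeper in older deployments or in KRaft in newer ones, records which broker leads each partition. Producers and consumers fetch this metadata from the brokers and then talk directly to partition leaders.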

In-sync replicas (ISR)

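An in-sync replica (ISR) is a replica that has kept up with the leader; the leader itself always counts as one. By default, only in-sync replicas are eligible to become the new leader during failover. The ISR set is dynamic -- a follower that lags for longer than replica.lag.time.max.ms drops out, and rejoins once it catches up.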
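
A simplified sketch of ISR membership and failover. It assumes a time-based lag check; `Replica`, `in_sync_replicas`, `elect_leader`, and `max_lag_seconds` are hypothetical names for illustration, and real leader election is carried out by the cluster controller rather than by a function like this.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    broker_id: int
    last_caught_up_at: float  # when this replica last matched the leader's log end


def in_sync_replicas(leader_id, replicas, now, max_lag_seconds):
    """Return broker ids currently considered in sync (leader always included)."""
    return [
        r.broker_id
        for r in replicas
        if r.broker_id == leader_id or now - r.last_caught_up_at <= max_lag_seconds
    ]


def elect_leader(isr, failed_broker_id):
    """Pick the first surviving ISR member, or None if none remain."""
    candidates = [b for b in isr if b != failed_broker_id]
    return candidates[0] if candidates else None


replicas = [Replica(1, 100.0), Replica(2, 99.5), Replica(3, 60.0)]
isr = in_sync_replicas(leader_id=1, replicas=replicas, now=100.0, max_lag_seconds=30.0)
new_leader = elect_leader(isr, failed_broker_id=1)

# Broker 3 catches up again and rejoins the ISR on the next check.
replicas[2].last_caught_up_at = 100.0
isr_after_catch_up = in_sync_replicas(1, replicas, now=101.0, max_lag_seconds=30.0)
```

Broker 3 is skipped during failover because it was out of sync at the time, so promoting it could lose records the old leader had already confirmed.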

Think first

Imagine a consumer reads the very latest message from the leader, but the leader crashes before that message is replicated to any follower. The new leader does not have that message. What consistency problem does this create, and how might you prevent it?

High-water mark

How does Kafka prevent consumers from reading data that might disappear if the leader crashes? With the High-Water Mark pattern.

The high-water mark is the highest offset replicated to all ISRs. Consumers can only read up to this point. Everything above it is "uncommitted" -- it exists on the leader but hasn't been confirmed by all ISRs.

Why this matters: If a consumer reads offset 7 from the leader, then the leader crashes before offset 7 is replicated, the new leader won't have that message. The consumer experiences a non-repeatable read. The high-water mark prevents this entirely.
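
The offset-7 scenario can be sketched as follows. This follows the definition used in this section (the high-water mark is the highest offset present on every ISR member); the function names and the `last_offsets` dictionary are assumptions made for the example.

```python
def high_water_mark(last_offsets, isr):
    """Highest offset that every in-sync replica already holds."""
    return min(last_offsets[broker] for broker in isr)


def readable(leader_log, hwm):
    """Records a consumer is allowed to fetch: offsets 0..hwm inclusive."""
    return leader_log[: hwm + 1]


leader_log = [f"m{i}" for i in range(8)]  # offsets 0..7 exist on the leader
last_offsets = {1: 7, 2: 6, 3: 6}         # broker 1 leads; followers lack offset 7
isr = [1, 2, 3]

hwm_before = high_water_mark(last_offsets, isr)
visible_before = readable(leader_log, hwm_before)  # offset 7 is withheld

# Once both followers fetch offset 7, the high-water mark advances.
last_offsets[2] = 7
last_offsets[3] = 7
hwm_after = high_water_mark(last_offsets, isr)
visible_after = readable(leader_log, hwm_after)
```

If the leader crashed while the high-water mark was still 6, no consumer would have seen offset 7, so a new leader without it would look consistent to every reader.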

Interview angle

Kafka's partitioning model is the answer to "How do you scale a messaging system?" More partitions = more parallelism. Replication with ISR = fault tolerance. High-water mark = consistency guarantee. Walk through all three layers to show you understand the complete picture.

Quiz

A topic has 3 partitions, each with a replication factor of 3 (1 leader + 2 followers). One follower for Partition 0 falls behind and is removed from the ISR. What happens to the high-water mark for Partition 0?

The high-water mark stays the same because it only requires a majority of replicas.

The high-water mark may advance further because the slow follower is no longer holding it back -- the HWM only tracks ISR members.

The high-water mark resets to zero because the ISR changed.

Producers can no longer write to Partition 0 until the follower catches up.
