
Kafka Workflow

Let's trace the end-to-end flow of messages through Kafka in both messaging models.

Think first
In many messaging systems, the broker pushes messages to consumers. What are the trade-offs of a push model vs. a pull model? Consider what happens when a consumer is slow or temporarily unavailable.

Pub-sub workflow

  1. Producer publishes a message to a topic
  2. Broker stores the message in the appropriate partition (key-based or round-robin)
  3. Consumer subscribes to the topic and receives the current offset
  4. Consumer polls Kafka at regular intervals for new messages
  5. The broker returns any new messages in the poll response
  6. Consumer processes the message and sends an acknowledgment (offset commit)
  7. Kafka records the committed offset (in the internal __consumer_offsets topic; legacy versions stored it in ZooKeeper)
  8. Repeat -- consumer keeps polling for new messages
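Step 2 above can be sketched in code. The real Kafka producer hashes keys with murmur2; the hash function and names below are illustrative only.

```python
# Hypothetical sketch of how a producer picks a partition for a message:
# key-based when a key is present, round-robin otherwise.
import zlib
from itertools import count

_round_robin = count()  # shared counter for keyless messages

def choose_partition(key, num_partitions):
    """Same key -> same partition; no key -> spread evenly."""
    if key is not None:
        # Real Kafka uses murmur2; crc32 is a stand-in for illustration.
        return zlib.crc32(key.encode()) % num_partitions
    return next(_round_robin) % num_partitions
```

Because the key is hashed, all messages for the same key land in the same partition, which is what preserves per-key ordering.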

Key detail: Consumers pull messages from Kafka (Kafka doesn't push). This lets each consumer control its own pace. If a consumer falls behind, it simply has more messages waiting; Kafka doesn't slow down other consumers.
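The pull model can be captured in a toy simulation. None of these names are Kafka client APIs; this is a minimal model, assuming an in-memory log standing in for a partition.

```python
# Minimal model of pull-based consumption: the broker never pushes;
# the consumer asks for the next batch at its own offset, at its own pace.

class BrokerLog:
    """Stand-in for one partition: an append-only list of messages."""
    def __init__(self):
        self.messages = []

    def append(self, msg):
        self.messages.append(msg)

    def fetch(self, offset, max_records=2):
        # Return up to max_records messages starting at `offset`.
        return self.messages[offset:offset + max_records]

log = BrokerLog()
for i in range(5):
    log.append(f"event-{i}")

offset = 0                      # each consumer tracks its own position
while batch := log.fetch(offset):
    for msg in batch:
        pass                    # process msg here
    offset += len(batch)        # advance only after processing the batch
```

A slow consumer simply calls `fetch` less often and lags further behind; the broker does nothing differently, and other consumers are unaffected.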

Replay capability: Since offsets are just numbers, a consumer can rewind to any offset and re-process old messages. This is useful for recomputing derived data, debugging, or recovering from processing errors.
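Replay is just arithmetic on offsets. A sketch, again using a plain list as the immutable log:

```python
# Since the log is immutable and offsets are sequential integers,
# re-processing old messages means reading from an earlier offset.
log = ["evt-0", "evt-1", "evt-2", "evt-3"]

def replay(log, from_offset):
    """Yield (offset, message) pairs from `from_offset` to the log's end."""
    for offset in range(from_offset, len(log)):
        yield offset, log[offset]

# Rewind to offset 1 and re-read everything after it.
recovered = list(replay(log, 1))
```

The real consumer API exposes the same idea as a seek operation: point the consumer at any retained offset and it re-reads from there.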

Think first
When multiple consumers share a group ID, Kafka must decide which consumer gets which partition. What should happen when a new consumer joins or an existing one crashes? What challenges does this rebalancing introduce?

Consumer group workflow

The same flow, but with work distribution:

  1. Producer publishes to a topic (same as above)
  2. A single consumer subscribes with a group ID
  3. When a second consumer joins with the same group ID, Kafka switches to shared mode:
    • Each partition is assigned to exactly one consumer in the group
    • Messages are distributed, not duplicated
  4. If the number of consumers exceeds partitions, excess consumers wait idle as standbys
  5. If a consumer leaves or crashes, its partitions are rebalanced to remaining consumers
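The assignment rules in steps 3–5 can be sketched as follows. This is a simplified round-robin assignment, not one of Kafka's actual assignor strategies (range, round-robin, sticky), but it shows the invariants: every partition goes to exactly one group member, and extra consumers sit idle.

```python
# Toy partition assignment for a consumer group.
def assign(partitions, consumers):
    """Map each partition to exactly one consumer in the group."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

parts = [0, 1, 2, 3]
before = assign(parts, ["c1", "c2", "c3"])
# c3 crashes -> rebalance: same partitions redistributed over fewer consumers
after = assign(parts, ["c1", "c2"])
```

Note that a rebalance recomputes the whole mapping, which is why real deployments care about minimizing partition movement (the motivation for Kafka's sticky assignor).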

Queue vs. pub-sub decision point

Same topic, same group ID → queue (messages distributed). Same topic, different group IDs → pub-sub (messages broadcast to each group). This is the only configuration difference between the two models.
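This decision point can be shown with a toy delivery model. The function and group names are illustrative, not a Kafka API.

```python
# Within a group, one member owns each partition (queueing).
# Across groups, every group independently receives the message (pub-sub).
def deliver(partition, groups):
    """groups: {group_id: [consumer names]}. Returns who gets the message."""
    received = {}
    for group_id, members in groups.items():
        # Exactly one member of each group owns this partition.
        received[group_id] = members[partition % len(members)]
    return received

# Same topic, one group -> the message reaches one consumer (queue)
queue = deliver(0, {"billing": ["c1", "c2"]})
# Same topic, two groups -> each group gets its own copy (pub-sub)
pubsub = deliver(0, {"billing": ["c1", "c2"], "audit": ["a1"]})
```

The only input that changed between the two calls is the set of group IDs, mirroring the point above: the group ID is the sole configuration difference between the two models.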

Quiz
Kafka uses a pull model where consumers poll for messages. What would happen if Kafka used a push model instead, where the broker sends messages to consumers as they arrive?