
Consumer Groups

A single consumer reading a topic with 100 partitions is a bottleneck. Consumer groups solve this by letting multiple consumers share the work of reading a topic -- while also enabling the pub-sub model where multiple independent groups each get all messages.

Think first
A single consumer assigned all 100 partitions of a topic has to process every record itself, so its throughput caps the whole pipeline. How would you scale consumption without duplicating messages? What constraint should exist between partitions and consumers to maintain ordering guarantees?

What is a consumer group?

A consumer group is a set of consumers that cooperatively consume a topic. Kafka ensures:

  • Each partition is assigned to exactly one consumer within the group
  • Each consumer may be assigned multiple partitions
  • Messages in a partition are processed in order by a single consumer

Partition assignment rules

Scenario                  What happens
Consumers = partitions    Each consumer gets exactly one partition (ideal)
Consumers < partitions    Some consumers handle multiple partitions
Consumers > partitions    Extra consumers sit idle (standby for failover)
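
The three scenarios can be reproduced with a toy model. The sketch below is a simplified round-robin assignment written for illustration only; `assign_partitions` and the consumer names are hypothetical, and Kafka's built-in assignors (range, round-robin, sticky) differ in detail.

```python
def assign_partitions(num_partitions, consumers):
    """Toy round-robin: partition p goes to consumers[p % len(consumers)].

    Illustrative model only, not Kafka's actual assignor.
    """
    assignment = {c: [] for c in consumers}
    if not consumers:
        return assignment
    for p in range(num_partitions):
        owner = consumers[p % len(consumers)]
        assignment[owner].append(p)
    return assignment


# Consumers = partitions: one partition each.
equal = assign_partitions(3, ["c1", "c2", "c3"])
# Consumers < partitions: some consumers own several partitions.
fewer = assign_partitions(4, ["c1", "c2"])
# Consumers > partitions: the extra consumer owns nothing and sits idle.
more = assign_partitions(2, ["c1", "c2", "c3"])

print(equal)
print(fewer)
print(more)
```

In every case each partition has exactly one owner inside the group, which is what preserves per-partition ordering.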

When a consumer joins or leaves the group, Kafka rebalances -- it redistributes the partitions among the group's current members. This happens automatically.
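
A rebalance can be modeled as recomputing the assignment whenever membership changes. This is a minimal sketch under that assumption: the `ConsumerGroup` class and its `join`/`leave` methods are hypothetical names for illustration, and the real protocol (coordinated by a broker, with heartbeats and session timeouts) is considerably more involved.

```python
class ConsumerGroup:
    """Toy consumer group that reassigns partitions on every membership change."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self.members = []
        self.assignment = {}

    def _rebalance(self):
        # Rebuild the whole assignment from scratch (simplified round-robin).
        self.assignment = {m: [] for m in self.members}
        for p in range(self.num_partitions):
            if self.members:
                owner = self.members[p % len(self.members)]
                self.assignment[owner].append(p)

    def join(self, consumer):
        self.members.append(consumer)
        self._rebalance()

    def leave(self, consumer):
        self.members.remove(consumer)
        self._rebalance()


group = ConsumerGroup(num_partitions=4)
group.join("c1")
after_one = dict(group.assignment)    # c1 owns all four partitions
group.join("c2")
after_two = dict(group.assignment)    # work is split between c1 and c2
group.leave("c1")
after_leave = dict(group.assignment)  # c2 takes over everything

print(after_one, after_two, after_leave)
```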

Think first
You need both work distribution (each message processed once) and broadcasting (every service gets every message). Traditional systems force you to choose one model. How could you use the concept of consumer groups to support both simultaneously?

Queue vs. pub-sub: both at once

Consumer groups are how Kafka unifies the two messaging models:

  • Queue behavior: Put all consumers in one group. Each message goes to exactly one consumer. Work is distributed.
  • Pub-sub behavior: Put each consumer in its own group. Every group gets every message. Each group processes independently.
  • Hybrid: Multiple groups, each with multiple consumers. Each group gets all messages; within each group, work is distributed.
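
The hybrid case can be simulated in a few lines. The sketch below assumes integer message keys mapped to partitions with a simple modulo, and a round-robin owner per partition within each group; `deliver` and the group and consumer names are hypothetical, chosen only to illustrate the delivery rules.

```python
def deliver(messages, num_partitions, groups):
    """Simulate delivery of (key, value) messages to several consumer groups.

    groups maps a group name to its list of consumers.
    Returns {group: {consumer: [values received]}}.
    """
    received = {g: {c: [] for c in members} for g, members in groups.items()}
    for key, value in messages:
        partition = key % num_partitions  # assumed partitioner: key modulo
        for g, members in groups.items():
            # Every group sees the message; one consumer per group handles it.
            owner = members[partition % len(members)]
            received[g][owner].append(value)
    return received


messages = [(0, "a"), (1, "b"), (2, "c"), (3, "d")]
groups = {"billing": ["b1", "b2"], "audit": ["a1"]}
received = deliver(messages, num_partitions=4, groups=groups)

print(received["billing"])  # work split between b1 and b2
print(received["audit"])    # a1 receives every message
```

Within `billing` each message is handled once (queue behavior), while `billing` and `audit` each receive the full stream (pub-sub behavior).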

Interview angle

Consumer groups are the answer to "How does Kafka support both queue and pub-sub semantics?" One group = queue (load balancing). Multiple groups = pub-sub (broadcast). The partition count determines max parallelism within a group -- you can't have more active consumers than partitions.

Quiz
A topic has 4 partitions and a consumer group has 6 consumers. What happens to the 2 extra consumers, and what happens if one of the active consumers crashes?