Role of ZooKeeper
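In ZooKeeper mode, Kafka brokers do not hold the authoritative copy of cluster metadata themselves. They store message logs on local disk, but rely on ZooKeeper (a distributed coordination service, similar to Chubby) for cluster metadata and coordination.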
What ZooKeeper stores for Kafka
| Metadata | Purpose |
|---|---|
| Broker registry | Which brokers are alive and their addresses |
| Topic configuration | Topics, their partition counts, replication factors |
| Partition leadership | Which broker is the leader for each partition |
| Consumer offsets | Last committed offset per consumer group per partition (legacy; modern clients commit offsets to the internal Kafka topic `__consumer_offsets`) |
| ACLs | Access control lists for topic authorization |
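To make the table more concrete, the following sketch maps each kind of metadata to the znode path conventionally used by Kafka in ZooKeeper mode. The helper function names are hypothetical, and the paths are given from memory of the usual layout (the ACL path in particular varies by version and pattern type), so verify them against the cluster you are inspecting.

```python
# Illustrative helpers that build the znode paths Kafka conventionally uses
# in ZooKeeper mode. Function names are made up for this example.

def broker_path(broker_id: int) -> str:
    """Ephemeral znode that a live broker registers (broker registry)."""
    return f"/brokers/ids/{broker_id}"


def topic_path(topic: str) -> str:
    """Topic znode holding the partition-to-replica assignment."""
    return f"/brokers/topics/{topic}"


def topic_config_path(topic: str) -> str:
    """Per-topic configuration overrides."""
    return f"/config/topics/{topic}"


def partition_state_path(topic: str, partition: int) -> str:
    """Current leader and in-sync replicas for one partition."""
    return f"/brokers/topics/{topic}/partitions/{partition}/state"


def legacy_offset_path(group: str, topic: str, partition: int) -> str:
    """Legacy consumer offset location (old ZooKeeper-based consumers only)."""
    return f"/consumers/{group}/offsets/{topic}/{partition}"


def topic_acl_path(topic: str) -> str:
    """ACLs for a literal topic resource (assumed layout; version dependent)."""
    return f"/kafka-acl/Topic/{topic}"


if __name__ == "__main__":
    print(broker_path(1))
    print(partition_state_path("orders", 1))
```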
How producers find the partition leader
In modern Kafka, clients no longer talk directly to ZooKeeper. Instead:
- The producer connects to any broker and asks: "Who is the leader for Partition 1?"
- The broker answers from its own metadata cache (kept current by the controller, which reads cluster state from ZooKeeper) and returns the leader broker's address
- The producer connects to the leader broker directly and publishes the message
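The lookup flow above can be sketched as a small in-memory simulation. This is not a Kafka client: the `Broker` class, `build_cluster`, `produce`, and the broker addresses are all hypothetical names invented for illustration, and the network is replaced by direct method calls.

```python
from dataclasses import dataclass, field


@dataclass
class Broker:
    broker_id: int
    address: str
    # Cached cluster metadata: (topic, partition) -> leader broker id.
    leaders: dict = field(default_factory=dict)
    # Known brokers: broker id -> address.
    peers: dict = field(default_factory=dict)
    # Local logs: (topic, partition) -> list of messages.
    logs: dict = field(default_factory=dict)

    def metadata(self, topic: str, partition: int):
        """Answer 'who is the leader?' from the local metadata cache."""
        leader_id = self.leaders[(topic, partition)]
        return leader_id, self.peers[leader_id]

    def append(self, topic: str, partition: int, message: str) -> int:
        """Append to the partition log; only the leader accepts writes."""
        if self.leaders[(topic, partition)] != self.broker_id:
            raise RuntimeError("not the leader for this partition")
        entries = self.logs.setdefault((topic, partition), [])
        entries.append(message)
        return len(entries) - 1  # offset of the appended message


def build_cluster() -> dict:
    """Three brokers that all share the same metadata view."""
    addresses = {1: "broker1:9092", 2: "broker2:9092", 3: "broker3:9092"}
    leaders = {("orders", 0): 1, ("orders", 1): 3}
    return {
        bid: Broker(bid, addr, dict(leaders), dict(addresses))
        for bid, addr in addresses.items()
    }


def produce(cluster: dict, bootstrap_id: int, topic: str, partition: int, message: str):
    # Step 1 and 2: ask any broker which broker leads the partition.
    leader_id, leader_addr = cluster[bootstrap_id].metadata(topic, partition)
    # Step 3: send the message straight to the leader.
    offset = cluster[leader_id].append(topic, partition, message)
    return leader_addr, offset


if __name__ == "__main__":
    cluster = build_cluster()
    print(produce(cluster, 2, "orders", 1, "hello"))  # ('broker3:9092', 0)
```

Note that the producer in this sketch never touches ZooKeeper; any broker can answer the metadata question, which is why a client only needs a short bootstrap list of broker addresses.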
Fault tolerance
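ZooKeeper replicates its data across its own ensemble, so the failure of a Kafka broker, or of a minority of ZooKeeper nodes, does not lose cluster state. If ZooKeeper becomes temporarily unavailable, brokers keep serving existing producers and consumers from their last-known metadata, but anything that needs coordination (leader election, topic creation, and similar changes) stalls until ZooKeeper is reachable again. When ZooKeeper recovers, brokers reconnect and coordination resumes.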
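ZooKeeper stays available only while a majority of its ensemble is up, which determines how many node failures it can absorb. A minimal sketch of that arithmetic, with hypothetical function names:

```python
def has_quorum(ensemble_size: int, nodes_up: int) -> bool:
    """A ZooKeeper ensemble can serve requests only with a strict majority up."""
    return nodes_up > ensemble_size // 2


def tolerated_failures(ensemble_size: int) -> int:
    """Largest number of nodes that can fail while a majority remains."""
    return (ensemble_size - 1) // 2


if __name__ == "__main__":
    for size in (3, 4, 5):
        print(size, tolerated_failures(size))
```

A 4-node ensemble tolerates no more failures than a 3-node one, which is why odd ensemble sizes are the usual choice.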
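ZooKeeper also triggers partition leader election when a broker fails. The failed broker's ZooKeeper session expires and its ephemeral registration disappears; ZooKeeper notifies the controller broker, which watches the broker registry and then assigns new leaders for the failed broker's partitions.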
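The controller's reaction to a broker failure can be sketched as follows. This is a simplified model, not Kafka's controller code: it assumes each partition tracks an ordered replica list and a set of in-sync replicas (ISR), that the new leader is the first surviving replica that is still in sync, and that a partition with no in-sync survivor goes offline (unclean election disabled). The function name and data layout are hypothetical.

```python
def elect_leaders(partitions: dict, failed_broker: int) -> dict:
    """Return updated partition state after `failed_broker` disappears.

    `partitions` maps a partition name to a dict with keys:
      "leader"   -> current leader broker id
      "isr"      -> list of in-sync replica broker ids
      "replicas" -> ordered list of all assigned replica broker ids
    """
    updated = {}
    for name, state in partitions.items():
        # The failed broker can no longer be considered in sync.
        isr = [b for b in state["isr"] if b != failed_broker]
        leader = state["leader"]
        if leader == failed_broker:
            # Prefer replicas in assignment order, but only if still in sync.
            # None means no eligible replica: the partition is offline.
            leader = next((b for b in state["replicas"] if b in isr), None)
        updated[name] = {
            "leader": leader,
            "isr": isr,
            "replicas": list(state["replicas"]),
        }
    return updated


if __name__ == "__main__":
    before = {
        "orders-0": {"leader": 1, "isr": [1, 2], "replicas": [1, 2, 3]},
        "orders-1": {"leader": 2, "isr": [2, 3], "replicas": [2, 3, 1]},
        "orders-2": {"leader": 1, "isr": [1], "replicas": [1, 3]},
    }
    after = elect_leaders(before, failed_broker=1)
    for name, state in after.items():
        print(name, state["leader"], state["isr"])
```

In this example `orders-0` moves to broker 2, `orders-1` is unaffected, and `orders-2` has no leader because its only in-sync replica was the failed broker.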
Apache Kafka has moved away from ZooKeeper with KRaft mode (Kafka Raft), and Kafka 4.0 removed ZooKeeper support entirely. In KRaft, a quorum of Kafka nodes running the controller role (dedicated, or co-located with brokers) acts as the coordination layer using the Raft consensus protocol, eliminating the external ZooKeeper dependency. The metadata and coordination concepts remain the same; only the implementation changes.