How Chubby Works
What happens when a Chubby cell boots up, a client connects for the first time, or an application needs to elect a leader? This section walks through the operational mechanics.
Service initialization
When a Chubby cell starts:
- Replicas run Paxos to elect a master.
- The master's identity is persisted in storage and propagated to all replicas.
From this point, the master handles every client request. Replicas participate in consensus but do not serve clients directly.
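The election step above can be illustrated with a much-simplified majority-vote stand-in for Paxos: a candidate replica becomes master only when a quorum (a majority) of replicas votes for it. All names and the voting model here are illustrative, not Chubby's actual protocol or API.

```python
def elect_master(candidate_id, replica_votes):
    """Return True if candidate_id holds a majority of replica votes.

    replica_votes maps each replica name to the candidate it voted for.
    """
    quorum = len(replica_votes) // 2 + 1
    votes = sum(1 for v in replica_votes.values() if v == candidate_id)
    return votes >= quorum

# Five replicas; replica ids double as candidate ids.
votes = {"r1": "r1", "r2": "r1", "r3": "r1", "r4": "r4", "r5": "r4"}
print(elect_master("r1", votes))  # r1 holds 3 of 5 votes: True
print(elect_master("r4", votes))  # r4 holds only 2 of 5: False
```

Real Paxos also guarantees safety across overlapping elections and replica failures; the quorum requirement shown here is only the intuition for why a majority of replicas must agree on a single master.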
Client initialization
When a Chubby client starts:
- The client queries DNS for the list of Chubby replicas in the cell.
- The client sends an RPC to any replica.
- If that replica is not the master, it returns the master's address.
- The client establishes a session with the master and sends all subsequent requests to it -- until the master indicates it is no longer the leader or stops responding.
This client discovery pattern -- contact any node, get redirected to the leader -- appears in many distributed systems (etcd, ZooKeeper, Kafka controller). It avoids a single point of failure for discovery while keeping writes centralized for consistency.
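The discovery flow can be sketched as follows. This is a toy model, not Chubby's client library: `Replica`, `handle`, and the redirect dictionary are assumptions made for the example.

```python
import random

class Replica:
    """A toy replica that either serves a request or redirects to the master."""
    def __init__(self, name, master_name):
        self.name = name
        self.master_name = master_name

    def handle(self, request):
        if self.name != self.master_name:
            # Not the master: tell the client who is.
            return {"redirect": self.master_name}
        # The master serves the request directly.
        return {"result": f"handled {request}"}

def send_request(replicas, request):
    # DNS would return this replica list; contact any one of them.
    node = random.choice(list(replicas.values()))
    resp = node.handle(request)
    if "redirect" in resp:
        # Follow the redirect to the master and retry.
        node = replicas[resp["redirect"]]
        resp = node.handle(request)
    return resp["result"]

# Three replicas; r2 is the current master.
replicas = {n: Replica(n, "r2") for n in ("r1", "r2", "r3")}
print(send_request(replicas, "open /ls/cell/file"))  # handled open /ls/cell/file
```

A production client would also cache the master's address for the life of the session and rerun discovery when the master stops responding, as described above.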
Leader election example
Consider an application with multiple instances that need exactly one leader. Here is how they use Chubby:
- All candidates attempt to acquire a Chubby lock on a designated election file.
- The first to acquire the lock becomes the leader.
- The leader writes its identity into the file so other instances can discover who won.
- If the leader crashes and its session lease expires, it loses the lock. Another candidate acquires it and becomes the new leader.
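The steps above can be sketched against a hypothetical in-memory election file; `try_acquire`, `release`, and `contents` mirror the lock-acquire / write-identity / expiry flow but are assumptions for this example, not Chubby's real API.

```python
class ElectionFile:
    """Toy election file: one lock holder, plus contents naming the leader."""
    def __init__(self):
        self.holder = None    # current lock holder, if any
        self.contents = ""    # leader identity written by the winner

    def try_acquire(self, instance_id):
        if self.holder is None:
            self.holder = instance_id
            return True
        return False

    def release(self):
        # Stands in for session-lease expiry after a crash.
        self.holder = None

def run_election(election_file, instance_id):
    if election_file.try_acquire(instance_id):
        # Winner records its identity so others can discover the leader.
        election_file.contents = instance_id
        return True
    return False

f = ElectionFile()
print(run_election(f, "app-1"))  # True: first candidate wins the lock
print(run_election(f, "app-2"))  # False: lock already held
print(f.contents)                # app-1
f.release()                      # leader crashed; lease expired
print(run_election(f, "app-2"))  # True: new leader takes over
```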
Pseudocode for leader election
This shows how little code is needed to add Chubby-based leader election to an existing application:
/* Create these files in Chubby manually once.
   Usually at least 3-5 are required to meet the minimum quorum requirement. */
lock_file_paths = {
    "ls/cell1/foo",
    "ls/cell2/foo",
    "ls/cell3/foo",
}
Main() {
    // Initialize the Chubby client library.
    chubbyLeader = newChubbyLeader(lock_file_paths)
    // Establish the client's connection with the Chubby service.
    chubbyLeader.Start()
    // Block until this instance becomes the leader.
    chubbyLeader.Wait()
    // This instance is now the leader.
    Log("Is Leader: " + chubbyLeader.isLeader())
    While(chubbyLeader.renewLeader()) {
        // Do work as the leader.
    }
    // Not the leader anymore.
}
Leader election with locks requires careful handling of the "zombie leader" problem: if an old leader's session expires before it realizes it, it may still issue commands. Chubby addresses this with sequencers and lock-delays (covered in Locks, Sequencers, and Lock-delays) -- the same concept as fencing tokens.
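The sequencer/fencing-token idea can be sketched as follows: each lock acquisition carries a monotonically increasing number, and a downstream resource rejects any request whose number is lower than the highest it has already seen. The `Resource` class and `write` method are illustrative names, not part of Chubby.

```python
class Resource:
    """A downstream server that fences off stale leaders by sequencer number."""
    def __init__(self):
        self.highest_seen = 0

    def write(self, token, data):
        if token < self.highest_seen:
            # A newer leader has already acted; this is a zombie's request.
            return "rejected: stale sequencer"
        self.highest_seen = token
        return f"accepted: {data}"

res = Resource()
print(res.write(33, "from old leader"))  # accepted: from old leader
print(res.write(34, "from new leader"))  # accepted: from new leader
print(res.write(33, "zombie retry"))     # rejected: stale sequencer
```

Because the zombie's token (33) is lower than the newest one the resource has seen (34), its late write is refused even though its session expiry has not yet reached it.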