High-level Architecture
How does a single lock service handle coordination for thousands of distributed processes without becoming a bottleneck itself? Chubby's architecture answers this with a compact, replicated design built on top of Paxos consensus.
Core terminology
| Term | Definition |
|---|---|
| Chubby cell | A Chubby cluster, typically 5 replicas in the same data center (though some cells span thousands of kilometers) |
| Master | The single replica elected via Paxos that handles all client requests |
| Replica | A server in the cell that participates in consensus and stores a copy of the data |
| Client library | An application-linked library that communicates with the master via RPC |
Server architecture
- A Chubby cell consists of a small set of servers (typically 5) known as replicas.
- One replica is elected master using Paxos -- it handles all client requests.
- If the master fails, another replica takes over (see Split-brain for how epoch numbers prevent conflicts).
- The master writes directly to its local database, then syncs asynchronously to all replicas -- ensuring data reliability even during master failover.
- Replicas are placed on different racks for fault tolerance.
Client architecture
Client applications communicate with the master through a Chubby client library using RPC.
When asked "how would you implement distributed coordination?", sketch Chubby's architecture: a small odd-numbered replica set (5 nodes), Paxos-elected master, clients talk only to the master. This is the same architecture ZooKeeper uses and the foundation for any coordination service answer.
Chubby APIs
Chubby exports a file system interface similar to POSIX, but much simpler. The namespace is a strict tree of files and directories with slash-separated paths:
Path format: /ls/chubby_cell/directory_name/.../file_name
/lsdesignates the lock service (Chubby system)chubby_cellidentifies the specific Chubby instance- The remainder is a series of directories ending in a file name
- The special name
/ls/localresolves to the nearest cell relative to the caller
Chubby was originally a pure lock service, but its creators discovered that associating small amounts of data with each lock entity was extremely useful. Each entity can now serve as a lock, a small file, or both.
API groups
| Group | API | Description |
|---|---|---|
| General | Open() | Opens a named file or directory, returns a handle |
Close() | Closes an open handle | |
Poison() | Cancels all Chubby calls by other threads without deallocating memory | |
Delete() | Deletes the file or directory | |
| File | GetContentsAndStat() | Atomically returns entire file contents + metadata (discourages large files) |
GetStat() | Returns metadata only | |
ReadDir() | Returns names and metadata of all children in a directory | |
SetContents() | Atomically writes entire file contents | |
SetACL() | Writes new access control list information | |
| Locking | Acquire() | Acquires a lock on a file (blocking) |
TryAcquire() | Non-blocking lock acquisition attempt | |
Release() | Releases a lock | |
| Sequencer | GetSequencer() | Gets a string representation of a lock's state |
SetSequencer() | Associates a sequencer with a handle | |
CheckSequencer() | Validates whether a sequencer is still current |