Database
Where does Chubby actually store its data? The answer changed over time -- and the migration story reveals important lessons about dependency management in distributed systems.
Evolution of Chubby's storage
Chubby initially used a replicated version of Berkeley DB. The Chubby team eventually replaced it with a custom-built database, citing the risk of depending on a third-party storage engine for such a critical service.
| Property | Berkeley DB (original) | Custom DB (replacement) |
|---|---|---|
| Data model | B-tree based | Simple key/value store |
| Transaction support | Full ACID transactions | Atomic operations only (no general transactions) |
| Replication | Berkeley DB's built-in replication | Database log distributed via Paxos |
| Durability | Write-ahead logging | Write-ahead logging + snapshotting |
| Maintenance risk | External dependency | Fully controlled by the Chubby team |
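The "atomic operations only" row can be made concrete with a toy sketch: a key/value store that supports single-key atomic operations such as compare-and-swap, but no multi-key transactions. The class and method names below are illustrative assumptions, not Chubby's actual API:

```python
import threading

class AtomicKVStore:
    """Toy key/value store: single-key atomic ops, no multi-key transactions.

    Illustrative only -- not Chubby's real interface.
    """

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def put(self, key, value):
        with self._lock:
            self._data[key] = value

    def compare_and_swap(self, key, expected, new):
        """Atomically set `key` to `new` only if its current value is `expected`."""
        with self._lock:
            if self._data.get(key) != expected:
                return False
            self._data[key] = new
            return True

store = AtomicKVStore()
store.put("/ls/cell/master", "replica-1")
# Succeeds: the current value matches the expectation.
ok = store.compare_and_swap("/ls/cell/master", "replica-1", "replica-2")
# Fails: the expectation is now stale.
stale = store.compare_and_swap("/ls/cell/master", "replica-1", "replica-3")
```

Single-key atomicity like this is enough for Chubby's workload (small files, leader-election locks), which is part of why the replacement database could drop general transactions.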
Chubby's migration from Berkeley DB to a custom store illustrates a recurring theme: critical infrastructure eventually needs to own its dependencies. The same reasoning drove Google to build Colossus (replacing GFS's single-master design) and Spanner (replacing ad-hoc sharding). When interviewing, mention this trade-off: external dependencies reduce initial effort but create long-term risk for foundational services.
Backup strategy
Chubby uses a write-ahead log (WAL) for durability: every database transaction is recorded in the log before it is applied. To prevent unbounded log growth:
- Every few hours, the master writes a snapshot of its database to a GFS server in a different building.
- After a successful snapshot, the previous transaction log is deleted.
- At any point, the complete system state = last snapshot + subsequent transaction log entries.
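The snapshot-plus-log invariant above can be sketched in a few lines. The record format and function name here are hypothetical, chosen only to show the replay logic:

```python
import json

def recover(snapshot, log_entries):
    """Rebuild state as: last snapshot + replay of subsequent WAL entries."""
    state = dict(snapshot)            # start from the most recent snapshot
    for entry in log_entries:         # replay log entries in write order
        op = json.loads(entry)
        if op["type"] == "put":
            state[op["key"]] = op["value"]
        elif op["type"] == "delete":
            state.pop(op["key"], None)
    return state

snapshot = {"/config/a": "1"}
log = [
    json.dumps({"type": "put", "key": "/config/b", "value": "2"}),
    json.dumps({"type": "delete", "key": "/config/a"}),
]
state = recover(snapshot, log)  # → {"/config/b": "2"}
```

Once a new snapshot is written successfully, every log entry it covers becomes redundant, which is exactly why the previous log can be deleted.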
| Backup concern | How Chubby addresses it |
|---|---|
| Building-level failures | Snapshots stored in a separate building |
| Cyclic dependencies | GFS cell in the same building might depend on this Chubby cell for master election -- so backups go to a different building's GFS |
| Replica initialization | New replicas bootstrap from backup snapshots instead of loading from other replicas |
| Disaster recovery | Backups enable full state reconstruction from scratch |
The cyclic dependency concern is subtle but critical. Chubby elects leaders for GFS and BigTable. If Chubby backed up to a GFS cell that depended on the same Chubby cell, a failure could create a deadlock where neither system can recover. Always map your dependency graph when designing backup strategies.
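Mapping the dependency graph can even be automated with a simple cycle check. The graph below is a hypothetical example of the deadlock described above, not Google's real topology:

```python
def find_cycle(graph):
    """DFS-based cycle detection over a dependency graph {node: [deps]}.

    Returns one cycle as a list of nodes, or None if the graph is acyclic.
    """
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {n: WHITE for n in graph}
    path = []

    def dfs(node):
        color[node] = GRAY
        path.append(node)
        for dep in graph.get(node, []):
            if color.get(dep, WHITE) == GRAY:
                return path[path.index(dep):] + [dep]  # back edge: cycle found
            if color.get(dep, WHITE) == WHITE and dep in graph:
                cycle = dfs(dep)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in graph:
        if color[node] == WHITE:
            cycle = dfs(node)
            if cycle:
                return cycle
    return None

# Backing Chubby up to a GFS cell that depends on the same Chubby cell
# closes a loop -- exactly the deadlock the text warns about.
deps = {
    "chubby-cell": ["gfs-backup-cell"],  # Chubby backs up to this GFS cell
    "gfs-backup-cell": ["chubby-cell"],  # ...which elects its master via Chubby
    "gfs-other-building": [],            # safe target: no edge back to Chubby
}
print(find_cycle(deps))  # ['chubby-cell', 'gfs-backup-cell', 'chubby-cell']
```

Pointing the backup at the other building's GFS cell removes the back edge, and the check returns no cycle.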
Mirroring
Chubby supports automatic file mirroring across cells -- copying collections of files from one cell to another.
| Mirroring property | Detail |
|---|---|
| Speed | Changes reflected in dozens of mirrors worldwide in under a second (files are small) |
| Trigger | Event mechanism notifies mirrors immediately on file add/delete/modify |
| Network partitions | Unreachable mirrors remain unchanged; on reconnection, checksums identify stale files |
| Primary use case | Distributing configuration files to computing clusters worldwide |
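The reconnection behavior in the table above amounts to a per-file checksum comparison. A minimal sketch, assuming the primary and mirror expose their files as path-to-bytes maps (the function names and data are illustrative):

```python
import hashlib

def checksum(data: bytes) -> str:
    """Content fingerprint for a single file."""
    return hashlib.sha256(data).hexdigest()

def stale_files(primary: dict, mirror: dict) -> set:
    """Return paths whose mirror copy is missing or differs from the primary,
    by comparing per-file checksums -- as a mirror reconnecting after a
    partition would."""
    return {
        path
        for path, data in primary.items()
        if path not in mirror or checksum(mirror[path]) != checksum(data)
    }

primary = {"/config/quota": b"limit=10", "/config/acl": b"group=eng"}
mirror = {"/config/quota": b"limit=5"}  # fell behind during a partition
print(sorted(stale_files(primary, mirror)))  # ['/config/acl', '/config/quota']
```

Only the stale paths need to be re-copied, which keeps catch-up cheap even for mirrors that were partitioned for a long time.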
Global cell
A special global cell has replicas distributed across widely separated geographic locations. It mirrors the subtree /ls/global/master to the subtree /ls/cell/replica in every other Chubby cell.
The global cell stores:
- Chubby's own ACLs
- Files where Chubby cells and other systems advertise their presence to monitoring services
- Pointers to large data sets (e.g., BigTable cells) and configuration files for other systems