GFS and Chubby
BigTable does not implement its own storage layer or its own distributed coordination. Instead, it delegates these problems to two specialized systems: GFS for storage and Chubby for coordination. This layered design is a powerful architectural pattern: each system solves one problem well, and BigTable composes them.
GFS
GFS is Google's distributed file system, purpose-built for large data-intensive workloads.
| Property | Detail |
|---|---|
| File structure | Files broken into fixed-size chunks (64 MB) |
| Storage | Chunks stored on ChunkServers |
| Metadata | Managed by the GFS master |
| Replication | Each chunk replicated across multiple ChunkServers on different racks |
| Data path | Clients read/write directly to ChunkServers (metadata only from master) |
In BigTable's context: SSTables (and the commit log) are written as ordinary GFS files; GFS splits each file into chunks and replicates those chunks across ChunkServers. This means BigTable gets durable, replicated storage without managing replication itself.
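The data path in the table above can be sketched in a few lines. This is a toy model, not the real GFS API: the class names (`GfsMaster`, `ChunkServer`), the `locate` call, and the in-memory dictionaries are all invented for illustration. The point it demonstrates is the split between the metadata path (one small RPC to the master) and the data path (bulk reads go directly to a ChunkServer).

```python
# Illustrative sketch of the GFS read path. All names here are
# hypothetical; only the metadata/data split mirrors the real design.

CHUNK_SIZE = 64 * 1024 * 1024  # GFS uses fixed 64 MB chunks


class ChunkServer:
    """Stores chunk data; clients read from it directly."""

    def __init__(self):
        self.chunks = {}  # chunk handle -> bytes

    def read(self, handle, offset, length):
        return self.chunks[handle][offset:offset + length]


class GfsMaster:
    """Holds metadata only: file name -> [(chunk handle, replica list)]."""

    def __init__(self):
        self.files = {}

    def locate(self, filename, byte_offset):
        chunk_index = byte_offset // CHUNK_SIZE
        handle, replicas = self.files[filename][chunk_index]
        return handle, replicas  # the client picks one replica to read from


def gfs_read(master, filename, offset, length):
    handle, replicas = master.locate(filename, offset)   # metadata RPC to master
    server = replicas[0]                                 # data path: straight to a ChunkServer
    return server.read(handle, offset % CHUNK_SIZE, length)
```

Because the master serves only small metadata responses, it never becomes a bandwidth bottleneck for bulk reads, which is why a single master can coordinate a very large cluster.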
For a detailed discussion, see GFS.
Chubby
Chubby is a highly available distributed lock service that keeps a multi-thousand-node BigTable cluster coordinated.
| Property | Detail |
|---|---|
| Replicas | Typically five active replicas; one elected master |
| Consensus | Paxos algorithm for replica consistency |
| Interface | Namespace of files and directories, each usable as a lock |
| Atomicity | Read/write to a Chubby file is atomic |
| Sessions | Clients maintain sessions with lease-based expiration |
| Callbacks | Clients register for change or session-expiry notifications |
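The session and lease rows above can be made concrete with a small sketch. This is not the real Chubby client library; the class name, the `keep_alive` method, and the injectable clock are assumptions made for illustration. It shows the core contract: a client must renew its lease before expiry, and once the lease lapses an expiry callback fires exactly once.

```python
# Toy model of Chubby's session/lease mechanics (names are invented;
# only the lease-renewal contract mirrors the real system).

import time


class ChubbySession:
    def __init__(self, lease_seconds, on_expiry, clock=time.monotonic):
        self.lease_seconds = lease_seconds
        self.on_expiry = on_expiry    # session-expiry notification callback
        self.clock = clock            # injectable for testing
        self.expires_at = clock() + lease_seconds
        self.expired = False

    def keep_alive(self):
        """Client heartbeat: extend the lease if the session is still live."""
        if not self.check():
            return False
        self.expires_at = self.clock() + self.lease_seconds
        return True

    def check(self):
        """Fire the expiry callback once, the first time the lease lapses."""
        if not self.expired and self.clock() > self.expires_at:
            self.expired = True
            self.on_expiry()
        return not self.expired
```

In BigTable, a tablet server whose expiry callback fires must stop serving its tablets, because the lapsed session means its registration (and any locks it held) are gone.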
BigTable's dependency on Chubby is total. If Chubby is unavailable for an extended period, BigTable becomes unavailable too. This is a deliberate trade-off: BigTable gets strong coordination guarantees but inherits Chubby's availability characteristics.
What BigTable uses Chubby for
| Use case | Mechanism |
|---|---|
| Master election | Master holds an exclusive lock on a Chubby file, kept alive via its session lease |
| Bootstrap location | The root of BigTable's metadata hierarchy is stored in a Chubby file |
| Tablet server discovery | New Tablet servers register in Chubby's "servers" directory |
| Schema storage | Column family information for each table lives in Chubby |
| Access control | ACLs stored as Chubby files |
BigTable's reliance on Chubby illustrates the separation of concerns principle in distributed systems. Rather than embedding consensus into BigTable, Google built one lock service (Chubby) and reused it across BigTable, GFS, and MapReduce. When asked about coordination in a system design interview, propose a similar pattern: use a dedicated coordination service (ZooKeeper, etcd, Chubby) rather than baking consensus into every component.