Chubby Introduction
Goal
Design a highly available and consistent service that can store small objects and provide a locking mechanism on those objects.
A locking mechanism enforces exclusive access to a resource. For example, an exclusive ("write") lock on a file lets one client modify it while preventing all others from reading from or writing to it.
Why Chubby matters
Here's a scenario that happens constantly in distributed systems: you have five replicas of a database, and you need exactly one of them to be the leader. How do you decide which one? All five nodes need to agree, and they need to agree even when some nodes are down or the network is partitioned.
This is the distributed consensus problem, and it's surprisingly hard to solve correctly. The Paxos algorithm solves it in theory, but implementing Paxos directly in every application that needs consensus would be impractical -- it's complex, error-prone, and every team would end up building its own buggy version.
Google's insight: wrap consensus in a simple, familiar interface. Instead of making every application implement Paxos, give them a service that looks like a file system with locks. Need leader election? Everyone tries to acquire a lock -- whoever gets it first is the leader. Need to store configuration? Write it to a file. Need a naming service? Use the file hierarchy.
That's Chubby: a distributed lock and file service that hides Paxos behind a developer-friendly API. A few lines of code, and your application gets distributed consensus without needing to understand the underlying algorithm.
Chubby (and its open-source counterpart ZooKeeper) shows up in interviews as the answer to: "How do you coordinate distributed nodes?" Leader election, service discovery, configuration management, distributed locking -- Chubby handles all of these. If your design needs any form of coordination, you need a Chubby-like service.
What is Chubby?
Chubby is a distributed coordination service that provides:
- A distributed locking mechanism for synchronizing activities across nodes
- Small file storage for metadata, configuration, and ACLs
- A consistent, highly available data store backed by Paxos consensus
Internally, it's a key/value store with locking on each object. Externally, it looks like a simple file system where you can create files, read them, and acquire locks on them.
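That dual nature -- key/value store inside, file system with locks outside -- can be sketched with a minimal in-memory stand-in. All class and method names below are hypothetical illustrations, not Chubby's actual API:

```python
# Minimal in-memory stand-in for a Chubby-like interface: a key/value
# store of small files, each with an exclusive lock. All names here
# are illustrative, not Chubby's real client API.

class FakeChubbyCell:
    def __init__(self):
        self.files = {}   # path -> bytes (small contents only)
        self.locks = {}   # path -> current holder, absent if free

    def write(self, path, data):
        self.files[path] = data

    def read(self, path):
        return self.files[path]

    def try_acquire(self, path, holder):
        # Exclusive lock: succeeds only if free (or already held by us).
        if self.locks.get(path) in (None, holder):
            self.locks[path] = holder
            return True
        return False

    def release(self, path, holder):
        if self.locks.get(path) == holder:
            del self.locks[path]

cell = FakeChubbyCell()
cell.write("/ls/cell/app/config", b"max_conns=100")
assert cell.read("/ls/cell/app/config") == b"max_conns=100"
assert cell.try_acquire("/ls/cell/app/leader", "replica-1")
assert not cell.try_acquire("/ls/cell/app/leader", "replica-2")
```

The point of the sketch is the shape of the interface: paths that look like file names, small values, and a lock attached to each object.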
| Property | Detail |
|---|---|
| CAP classification | CP -- strong consistency guaranteed by Paxos |
| Architecture | Leader-based: one master serves all reads/writes, typically 5 replicas |
| Consistency | Linearizable -- reads always see the latest write |
| Data size | Small objects only (kilobytes, not megabytes) |
| Lock type | Coarse-grained (held for hours/days, not milliseconds) |
| Open-source equivalent | Apache ZooKeeper |
Chubby use cases
Leader/master election
The most common use case. Multiple replicas of a service compete to acquire a Chubby lock. The winner becomes the leader. If the leader dies, it loses the lock, and another replica takes over. GFS and BigTable both use Chubby for leader election.
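The pattern can be simulated with a single shared lock object standing in for a Chubby file (the lock API here is illustrative, not Chubby's actual interface):

```python
# Simulated leader election: each replica tries to grab the same
# exclusive lock; exactly one succeeds and becomes leader. The lock
# object stands in for a Chubby file -- names are illustrative.

class ElectionLock:
    def __init__(self):
        self.holder = None

    def try_acquire(self, node):
        if self.holder is None:
            self.holder = node
            return True
        return False

    def release_on_failure(self, node):
        # In Chubby the lock is released when the holder's session
        # expires; here we model that as an explicit release.
        if self.holder == node:
            self.holder = None

lock = ElectionLock()
replicas = ["db-1", "db-2", "db-3", "db-4", "db-5"]
leader = next(r for r in replicas if lock.try_acquire(r))
assert leader == "db-1"  # first to acquire wins
assert all(not lock.try_acquire(r) for r in replicas if r != leader)

# Leader dies: the lock is released and another replica takes over.
lock.release_on_failure(leader)
new_leader = next(r for r in replicas
                  if r != leader and lock.try_acquire(r))
assert new_leader == "db-2"
```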
Naming service (replacing DNS)
DNS is slow to update due to TTL-based caching -- changes can take minutes to propagate. Chubby provides a consistent, instantly updated namespace. At Google, Chubby has largely replaced DNS for internal service discovery.
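The staleness problem is easy to demonstrate with a toy TTL cache. In the DNS model a reader may see an old address until the TTL expires; in the Chubby-style model every read goes to the consistent store and sees the latest write. Names and numbers below are illustrative:

```python
# Toy contrast between TTL-based DNS caching and a consistent
# namespace. Everything here is a simplified illustration.

import time

class TTLCache:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.value = None
        self.fetched_at = 0.0

    def resolve(self, fetch):
        # Re-fetch only when the cached answer has expired.
        if self.value is None or time.monotonic() - self.fetched_at > self.ttl:
            self.value = fetch()
            self.fetched_at = time.monotonic()
        return self.value  # may be stale inside the TTL window

authoritative = {"search-backend": "10.0.0.1"}
cache = TTLCache(ttl_seconds=60)
assert cache.resolve(lambda: authoritative["search-backend"]) == "10.0.0.1"

authoritative["search-backend"] = "10.0.0.2"  # service moved
# DNS-style: the cached answer stays stale until the TTL expires.
assert cache.resolve(lambda: authoritative["search-backend"]) == "10.0.0.1"
# Chubby-style: every read hits the consistent store, so it's fresh.
assert authoritative["search-backend"] == "10.0.0.2"
```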
Metadata storage
Chubby provides a Unix-style interface for storing small files that change infrequently. GFS and BigTable store their metadata in Chubby. Services use it for configuration files and ACLs.
Distributed locking
Coarse-grained locks (as opposed to fine-grained) for synchronizing distributed activities. A few lines of code give your application distributed mutexes and semaphores.
- Coarse-grained: One lock on a large resource (a file, a database table). Held for long periods -- hours or days. This is what Chubby is designed for.
- Fine-grained: Locks on small resources (a row, a record). Held for milliseconds. Chubby is not designed for this -- the overhead would be too high.
When NOT to use Chubby
| Anti-pattern | Why |
|---|---|
| Bulk storage | Chubby is for kilobytes, not gigabytes -- use GFS/BigTable for large data |
| High update rates | Chubby is optimized for reads, not frequent writes |
| Frequent lock acquisition/release | Designed for coarse-grained locks held for long periods |
| Publish/subscribe patterns | Use Kafka or a message queue instead |
Background: Chubby and Paxos
Paxos is the consensus algorithm at Chubby's core. All writes go through Paxos to ensure that a majority of replicas (typically 3 out of 5) agree before a write is committed. This is what gives Chubby its strong consistency guarantee.
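The majority rule itself is simple arithmetic, and it can be illustrated with a toy quorum check. This is a sketch of the commit condition only, not Paxos itself (no ballots, no two-phase exchange):

```python
# Toy quorum check: a write commits only when a majority of replicas
# acknowledge it. Illustrates the majority rule, not the full Paxos
# protocol.

def majority(n_replicas):
    return n_replicas // 2 + 1

def committed(acks, n_replicas=5):
    return len(acks) >= majority(n_replicas)

assert majority(5) == 3
assert committed({"r1", "r2", "r3"})   # 3 of 5 acked -> committed
assert not committed({"r1", "r2"})     # only 2 of 5 -> not committed

# Any two majorities of 5 replicas overlap in at least one replica,
# which is why a later round can always learn a committed value.
assert majority(5) + majority(5) > 5
```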
Chubby isn't a research project or a novel algorithm -- it's an engineering product that wraps a well-known theoretical solution (Paxos) in a practical, developer-friendly interface. Its contribution is in the API design and the operational experience of running consensus at Google's scale.
What's next
In the following chapters, we'll explore:
- Chubby's high-level architecture -- master, replicas, and cells
- Design rationale -- why a lock service instead of a Paxos library
- How Chubby works -- the mechanics of reads, writes, and master election
- Files, directories, and handles
- Locks, sequencers, and lock delays
- Sessions and events -- how clients stay connected
- Caching and scaling strategies