Summary: BigTable

The big picture

BigTable is Google's answer to a specific problem: how do you store petabytes of structured data and serve reads in milliseconds, across thousands of machines? The answer: build a wide-column store on top of two existing infrastructure layers -- GFS for durable storage and Chubby for coordination.

What makes BigTable instructive is its layered architecture. It doesn't reinvent file storage or distributed consensus -- it delegates those problems to GFS and Chubby, respectively, and focuses on what it does uniquely well: providing a structured data model with fast random reads over massive datasets. The design principle generalizes: build on proven infrastructure rather than reimplementing every layer.

Architecture at a glance

Component	Role	Depends on
Master	Assigns tablets to tablet servers, monitors load, handles schema changes	Chubby (for master election)
Tablet servers	Serve reads/writes for their assigned tablets	GFS (for SSTable storage)
Chubby	Master election, schema storage, tablet server discovery, access control	Paxos (internal)
GFS	Stores SSTables and commit logs durably	ChunkServers

The data path

Write path:

1. The write is appended to the tablet server's commit log (a write-ahead log on GFS)
2. The data is inserted into an in-memory MemTable (sorted by key)
3. When the MemTable reaches a size threshold, it is flushed to GFS as an immutable SSTable
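
The three write steps above can be sketched in a few lines. This is a minimal illustration, not BigTable's implementation: the class name `TabletWriter`, the entry-count threshold (a real system would track bytes), and the in-memory lists standing in for GFS files are all assumptions made for the example.

```python
class TabletWriter:
    """Toy write path: commit log append, MemTable insert, flush to SSTable."""

    def __init__(self, flush_threshold=3):
        self.commit_log = []   # stands in for the write-ahead log on GFS
        self.memtable = {}     # in-memory buffer; sorted by key when flushed
        self.sstables = []     # each SSTable is an immutable, key-sorted tuple
        self.flush_threshold = flush_threshold  # assumed: counted in entries

    def write(self, key, value):
        self.commit_log.append((key, value))  # 1. log first, for durability
        self.memtable[key] = value            # 2. then apply in memory
        if len(self.memtable) >= self.flush_threshold:
            self.flush()                      # 3. flush once the threshold is hit

    def flush(self):
        # Freeze the MemTable into a sorted, immutable SSTable.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        # Flushed entries are durable in the SSTable, so they need no replay.
        self.commit_log.clear()
```

Logging before applying the write is what makes recovery possible: after a crash, any entry still in the commit log can be replayed into a fresh MemTable.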

Read path:

1. Check the MemTable first (most recent data)
2. Check Bloom filters on SSTables to skip files that definitely don't contain the key
3. Read matching SSTables from GFS (or from cache)
4. Merge results across all sources
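
A rough sketch of the read steps above follows. The `BloomFilter`, `SSTable`, and `read` names are hypothetical, the filter sizes are arbitrary, and the "merge" is simplified to "the newest source that has the key wins", which is enough for a single-value lookup but not for a real multi-version merge.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=256, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class SSTable:
    """Immutable key/value file with its own Bloom filter."""

    def __init__(self, entries):
        self.entries = dict(entries)
        self.bloom = BloomFilter()
        for key in self.entries:
            self.bloom.add(key)

    def get(self, key):
        return self.entries.get(key)  # stands in for a GFS or cache read


def read(key, memtable, sstables):
    """Look up key; sstables must be ordered oldest to newest."""
    if key in memtable:                  # 1. most recent data
        return memtable[key]
    for sstable in reversed(sstables):   # newest SSTable first
        if not sstable.bloom.might_contain(key):
            continue                     # 2. skip files that cannot match
        value = sstable.get(key)         # 3. read the candidate file
        if value is not None:
            return value                 # 4. newest version wins
    return None
```

A Bloom filter can only say "definitely absent" or "possibly present", so a positive answer still requires reading the SSTable; the saving comes from the files that are skipped.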

How BigTable uses system design patterns

Problem	Pattern	How BigTable uses it
Surviving tablet server crashes	Write-ahead Log	Commit log stored on GFS; replayed during recovery
Monitoring tablet servers	Heartbeat	Master monitors tablet server health via Chubby sessions
Coordinating the cluster	Leader and Follower	Single master assigns and balances tablets across tablet servers
Avoiding unnecessary disk reads	Bloom Filters	Per-SSTable Bloom filters skip files that cannot contain the target row/column pair
Verifying data integrity	Checksum	SSTable blocks are checksummed to detect corruption
Master election and discovery	Lease (via Chubby)	The master's lock and each tablet server's registration are tied to Chubby sessions with time-bound leases that must be renewed
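
The lease row is easiest to see in code. The sketch below is a generic lease table, not Chubby's API: `LeaseRegistry`, `renew`, and `live_holders` are hypothetical names, and time is passed in explicitly so the behavior is deterministic.

```python
class LeaseRegistry:
    """Toy lease table: a holder counts as alive only while it keeps renewing."""

    def __init__(self, lease_seconds):
        self.lease_seconds = lease_seconds
        self.expiry = {}  # holder name -> time at which its lease lapses

    def renew(self, holder, now):
        # Each renewal (heartbeat) pushes the expiry forward by one lease period.
        self.expiry[holder] = now + self.lease_seconds

    def live_holders(self, now):
        # A holder that stopped renewing drops out once its lease has lapsed.
        return sorted(h for h, t in self.expiry.items() if t > now)


registry = LeaseRegistry(lease_seconds=10)
registry.renew("tablet-server-1", now=0)
registry.renew("tablet-server-2", now=0)
registry.renew("tablet-server-1", now=8)   # only server 1 keeps renewing
print(registry.live_holders(now=12))       # server 2's lease has lapsed
```

The time bound is the point of the pattern: a crashed or partitioned server does not need to announce its failure, because its claim simply expires.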

BigTable vs. Cassandra: same model, opposite architectures

Dimension	BigTable	Cassandra
Data model	Wide-column (column families)	Wide-column (column families)
Architecture	Single master (centralized)	Peer-to-peer (decentralized)
Consistency	Strong (CP)	Tunable (AP by default)
Partitioning	Range-based (tablets)	Consistent hashing (vnodes)
Coordination	Chubby (Paxos)	Gossip protocol
Conflict resolution	N/A (strong consistency)	Last-write-wins
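
The partitioning row can be made concrete with a small sketch. Both functions are illustrative assumptions rather than either system's code: `find_tablet` models range partitioning with a sorted list of tablet end keys, and `find_node` models a bare consistent-hash ring without virtual nodes.

```python
import bisect
import hashlib


def find_tablet(row_key, tablet_end_keys):
    """Range partitioning: return the index of the tablet covering row_key.

    tablet_end_keys is a sorted list of inclusive end keys; one extra
    tablet covers every key beyond the last end key.
    """
    return bisect.bisect_left(tablet_end_keys, row_key)


def ring_position(value):
    """Map a string to a position on the hash ring."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


def find_node(row_key, nodes):
    """Consistent hashing: the first node clockwise from the key's position."""
    ring = sorted((ring_position(node), node) for node in nodes)
    positions = [pos for pos, _ in ring]
    idx = bisect.bisect_right(positions, ring_position(row_key)) % len(ring)
    return ring[idx][1]


end_keys = ["g", "p"]  # three tablets: up to "g", up to "p", and the rest
print([find_tablet(k, end_keys) for k in ["apple", "avocado", "banana"]])
print([find_node(k, ["n1", "n2", "n3"]) for k in ["apple", "avocado", "banana"]])
```

Adjacent keys land in the same tablet under range partitioning, which keeps range scans cheap but can concentrate load on one tablet. Hashing scatters adjacent keys across nodes, which spreads load but turns a range scan into a request to many nodes.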

Interview insight

This comparison is valuable in interviews. If asked "How would you design a wide-column store?", you can present both approaches and discuss the trade-offs: BigTable's single master simplifies consistency but is a single point of coordination, although clients read and write through tablet servers directly, so the master stays off the data path. Cassandra's peer-to-peer design avoids that dependency and scales out more easily, but makes consistency harder. Neither is "better" -- they optimize for different requirements.

Quick reference card

Property	Value
Type	Wide-column NoSQL store
CAP classification	CP -- strongly consistent
Data model	`(row key, column family:qualifier, timestamp) → value`
Partitioning	Range-based tablet splitting
Storage engine	MemTable → SSTable (log-structured merge tree)
Underlying storage	GFS (SSTables stored as GFS files)
Coordination	Chubby (master election, schema, discovery)
Atomicity	Per-row (no cross-row transactions)
Open source	No (HBase is the open-source equivalent)
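
The data model row, `(row key, column family:qualifier, timestamp) → value`, can be pictured as a nested, versioned map. The `Table` class below is a hypothetical in-memory sketch; the row and column names follow the Webtable example in the BigTable paper and are used here only as sample data.

```python
from collections import defaultdict


class Table:
    """Toy versioned map: row key -> "family:qualifier" -> {timestamp: value}."""

    def __init__(self):
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, column, timestamp, value):
        self.rows[row_key][column][timestamp] = value

    def get(self, row_key, column, timestamp=None):
        """Return the newest value at or before timestamp (newest overall if None)."""
        versions = self.rows.get(row_key, {}).get(column, {})
        eligible = [ts for ts in versions if timestamp is None or ts <= timestamp]
        return versions[max(eligible)] if eligible else None


table = Table()
table.put("com.cnn.www", "contents:", 3, "<html>v3")
table.put("com.cnn.www", "contents:", 5, "<html>v5")
table.put("com.cnn.www", "anchor:cnnsi.com", 9, "CNN")

print(table.get("com.cnn.www", "contents:"))               # newest version
print(table.get("com.cnn.www", "contents:", timestamp=4))  # version as of t=4
```

Because every cell is addressed by its row key first, all versions of all columns for one row live together, which is what makes per-row atomicity a natural boundary.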

Design Challenge

Design a web crawler's URL database

You need to design a database for a web crawler that stores billions of URLs. For each URL, you need to store the page content, metadata (title, language, status code), and a history of past crawls. The system must support fast lookups by URL for deduplication and batch analytics for computing PageRank.

References and further reading

BigTable paper -- the original 2006 paper
SSTable and LSM Tree internals
Apache HBase -- open-source BigTable clone
Dynamo paper -- the other major influence on Cassandra
