The Life of BigTable's Read & Write Operations
What happens, step by step, when a client writes a row to BigTable? What about when it reads one? The write path prioritizes durability and speed (an append-only log plus an in-memory buffer), while the read path prioritizes freshness (a merge across the MemTable and SSTables). Both paths bypass the master entirely: clients talk directly to Tablet servers.
Write request
When a Tablet server receives a write:
| Step | Action |
|---|---|
| 1 | Validate the request is well-formed |
| 2 | Authorize the sender via ACLs stored in Chubby |
| 3 | Append the mutation to the commit log in GFS (Write-Ahead Log pattern) |
| 4 | Insert the mutation into the in-memory MemTable |
| 5 | Acknowledge success to the client |
| 6 | Periodically flush MemTables to SSTables; merge SSTables during compaction |
Notice the order: commit log first, then MemTable, then acknowledge. The client gets an ACK only after the mutation is durable on GFS. This guarantees that no acknowledged write is lost, even if the Tablet server crashes immediately after responding. This is the textbook Write-Ahead Log pattern -- a must-know for any storage system interview.
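The steps above can be sketched in a few lines. This is a hypothetical illustration, not BigTable's actual code: a Python list stands in for the GFS-backed commit log, a dict for the MemTable, and validation/ACL checks are elided.

```python
import json

class TabletServerWriteSketch:
    """Illustrative write path: log first, MemTable second, ACK last."""

    def __init__(self):
        self.commit_log = []   # stands in for the commit log on GFS
        self.memtable = {}     # in-memory buffer of recent mutations

    def write(self, row_key, value):
        # Steps 1-2 (validation, ACL check via Chubby) elided to one guard
        if not isinstance(row_key, str):
            raise ValueError("malformed request")
        # Step 3: append the mutation to the commit log FIRST (WAL pattern)
        self.commit_log.append(json.dumps({"key": row_key, "value": value}))
        # Step 4: only then apply it to the MemTable
        self.memtable[row_key] = value
        # Step 5: acknowledge -- the mutation is already durable
        return "ACK"

    def recover(self):
        # After a crash, replaying the log rebuilds the lost MemTable,
        # which is why no acknowledged write can be lost.
        self.memtable = {}
        for entry in self.commit_log:
            mutation = json.loads(entry)
            self.memtable[mutation["key"]] = mutation["value"]
```

The ordering is the whole point: if the server crashed between steps 3 and 5, `recover()` would replay the log and the write would still survive, even though the client never got an ACK.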
Read request
When a Tablet server receives a read:
| Step | Action |
|---|---|
| 1 | Validate the request and authorize the sender |
| 2 | Check the cache (Scan Cache and Block Cache) for a hit |
| 3 | Read the MemTable for the most recent mutations |
| 4 | Consult SSTable indexes (loaded in memory) to identify relevant SSTables |
| 5 | Merge rows from MemTable and SSTables to produce the final result |
Since both the MemTable and SSTables are sorted by key, the merge operation is efficient -- it behaves like a merge step in merge-sort.
A read may need to touch every SSTable that makes up a Tablet if the key exists in multiple files (due to updates and compaction lag). This is the main cost of LSM-based storage. BigTable mitigates this with Bloom Filters (skip SSTables that definitely don't contain the key), caching (avoid repeated disk reads), and compaction (reduce the number of SSTables).
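To make the Bloom Filter mitigation concrete, here is a toy version (illustrative only; BigTable's real filters are tuned per SSTable). The key property: false positives are possible, but false negatives are not, so a "no" answer lets the read path skip an SSTable without ever missing data.

```python
import hashlib

class BloomFilter:
    """Toy Bloom filter: k hash positions per key over a fixed bit array."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # an int used as a bit array

    def _positions(self, key):
        # Derive num_hashes independent positions from a salted hash
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        # All bits set -> "maybe here"; any bit clear -> "definitely not"
        return all(self.bits & (1 << pos) for pos in self._positions(key))
```

On the read path, each SSTable's filter is consulted before touching disk: if `might_contain(row_key)` is false, that SSTable is skipped entirely, which is what keeps reads cheap even when a Tablet has accumulated many SSTables between compactions.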