
Bigtable Components

Who does what in a Bigtable cluster? The system has a strict division of labor: a single master handles metadata and coordination, while many Tablet servers handle all data reads and writes. Understanding this split, and especially understanding that the master is not on the data path, is key to understanding Bigtable's scalability.

Think first
Bigtable uses a single master server for coordination. How would you prevent this single master from becoming a bottleneck or single point of failure?

Component overview

Component        Count                      Responsibilities
Client library   One per client             Communicates with Tablet servers for data, master for metadata
Master server    One per cluster            Tablet assignment, load balancing, GC, schema operations
Tablet servers   Many (tens to thousands)   Serve reads/writes for assigned Tablets

Bigtable master server

The single master is responsible for:

  • Assigning Tablets to Tablet servers and rebalancing load
  • Monitoring Tablet server health (via Chubby sessions)
  • Garbage collecting underlying files in GFS
  • Handling metadata operations (table and column family creation/deletion)
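The assignment and health-monitoring duties above can be sketched in a few lines. This is an illustrative model, not Google's implementation: `rebalance`, its data shapes, and the least-loaded placement policy are assumptions for the sketch.

```python
# Illustrative sketch (not Google's implementation) of one master duty:
# detect Tablet servers whose Chubby session has lapsed and reassign
# their Tablets to the least-loaded live server.
def rebalance(assignments, live_servers):
    """assignments: dict tablet_id -> server; live_servers: set of live servers."""
    # Tablets whose server is no longer alive must be reassigned.
    orphaned = [t for t, s in assignments.items() if s not in live_servers]
    servers = sorted(live_servers)
    # Count current load (Tablets per live server).
    load = {s: 0 for s in servers}
    for s in assignments.values():
        if s in load:
            load[s] += 1
    # Place each orphaned Tablet on the currently least-loaded server.
    for t in orphaned:
        target = min(servers, key=lambda s: load[s])
        assignments[t] = target
        load[target] += 1
    return assignments
```

For example, if `ts-b` dies, its Tablets are spread across the surviving servers; the data itself never moves through the master, only the assignment metadata changes.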

Critically, the master does not participate in data reads or writes. Clients communicate directly with Tablet servers for all data operations. This design follows the Leader and Follower pattern: the leader (master) coordinates, while the followers (Tablet servers) serve data independently.
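The client-side read path can be sketched as follows. This is a hypothetical simplification: `BigtableClient`, its cached location list, and the stringified RPC are illustrative stand-ins for the real METADATA-tablet lookup and RPC machinery, but the key property holds: once Tablet locations are cached, a read never touches the master.

```python
# Hypothetical sketch of the client data path: row reads go straight
# to Tablet servers; the master is consulted only for metadata (here,
# the location cache is assumed to be pre-populated).
from bisect import bisect_right

class BigtableClient:
    def __init__(self, tablet_locations):
        # tablet_locations: sorted list of (start_row_key, server_addr),
        # cached from an earlier metadata lookup.
        self.locations = tablet_locations

    def _locate(self, row_key):
        # Find the Tablet whose start key is the greatest key <= row_key.
        keys = [start for start, _ in self.locations]
        idx = bisect_right(keys, row_key) - 1
        return self.locations[idx][1]

    def read(self, row_key):
        server = self._locate(row_key)        # cache hit: no master RPC
        return f"GET {row_key} -> {server}"   # direct call to a Tablet server

client = BigtableClient([("", "ts-1"), ("m", "ts-2")])
print(client.read("apple"))   # served by ts-1
print(client.read("zebra"))   # served by ts-2
```

Only when the cache is stale or empty does the client fall back to metadata lookups, which is why metadata traffic stays tiny relative to data traffic.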

Interview angle

"Isn't a single master a bottleneck?" is a common interviewer challenge. The answer: no, because the master only handles metadata -- Tablet assignment, schema changes, and health monitoring. All data flows directly between clients and Tablet servers. The master's workload is proportional to the number of Tablets (metadata), not the volume of reads/writes (data). This is the same principle behind HDFS's NameNode and GFS's master.

Tablet servers

Property                   Detail
Tablets per server         Typically 10 to 1,000
Scaling                    Servers can be added or removed dynamically
Data path                  Clients communicate directly with Tablet servers for reads/writes
Tablet splitting           Tablet servers split Tablets that grow too large and notify the master
Tablet creation/deletion   Initiated by the master
Warning

Tablet servers handle splitting, but the master initiates merging and deletion. If a Tablet server splits a Tablet, it notifies the master after the fact. This asymmetry exists because splitting is a local decision (the Tablet is too big), while merging requires a global view of the key space.
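The asymmetry can be sketched as a purely local split decision. This is an illustrative model: `maybe_split`, the size threshold, and the `"mid-key"` placeholder are assumptions, not the real split logic (which picks an actual key from the Tablet's data).

```python
# Sketch of the split asymmetry: a Tablet server can decide to split
# using only local state (the Tablet's size), then tell the master
# after the fact. A merge, by contrast, would need the master's global
# view of which Tablets are adjacent in the key space.
SPLIT_THRESHOLD_BYTES = 1 << 30     # assumed ~1 GiB trigger

def maybe_split(tablet, notify_master):
    """Local decision: no coordination needed before splitting."""
    if tablet["size_bytes"] > SPLIT_THRESHOLD_BYTES:
        half = tablet["size_bytes"] // 2
        # "mid-key" is a placeholder; a real split picks a key that
        # divides the Tablet's data roughly in half.
        left = {"range": (tablet["range"][0], "mid-key"), "size_bytes": half}
        right = {"range": ("mid-key", tablet["range"][1]),
                 "size_bytes": tablet["size_bytes"] - half}
        notify_master(left, right)  # master learns after the fact
        return [left, right]
    return [tablet]
```

Note the direction of information flow: the Tablet server acts first and notifies the master, whereas creation, merging, and deletion flow the other way, from master to server.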

Quiz
What would happen if Bigtable allowed Tablet servers to merge Tablets independently (without the master's involvement), the same way they can split them?