Single Master and Large Chunk Size
Two of GFS's most consequential design decisions -- a single master and a 64MB chunk size -- were controversial even at the time. Both trade flexibility for simplicity, and understanding why those tradeoffs work reveals a lot about practical distributed systems design.
Single master
A single master simplifies everything: it has a global view of the cluster and can make optimal chunk placement and replication decisions without distributed consensus. The risk is obvious -- it could become a bottleneck.
GFS mitigates this by keeping the master off the data path. Clients never read or write file data through the master. The interaction looks like this:
- Client asks the master: "Which ChunkServers hold chunk X?"
- Master replies with ChunkServer locations
- Client caches this metadata (with a timeout) and talks directly to ChunkServers for all subsequent data operations
The master handles only metadata -- the lightweight control plane. All heavy data transfer happens on the separate data plane between clients and ChunkServers.
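This control-plane/data-plane split can be sketched in a few lines. The sketch below is illustrative, not the real GFS client library: the class names, the `get_chunk_locations` RPC, and the 60-second cache TTL are all assumptions for the example.

```python
import time

class GFSClientSketch:
    """Hypothetical sketch of the GFS client read path: metadata comes
    from the master, file data flows directly to/from ChunkServers.
    All names here are illustrative, not from the paper."""

    CHUNK_SIZE = 64 * 1024 * 1024   # 64MB chunks
    CACHE_TTL_SECONDS = 60          # assumed timeout; GFS leaves this tunable

    def __init__(self, master):
        self.master = master
        # (filename, chunk_index) -> (chunkserver locations, cache expiry)
        self.cache = {}

    def locate_chunk(self, filename, byte_offset):
        # Translate a byte offset into a chunk index -- one metadata
        # lookup covers the entire 64MB chunk.
        chunk_index = byte_offset // self.CHUNK_SIZE
        key = (filename, chunk_index)
        entry = self.cache.get(key)
        if entry and entry[1] > time.monotonic():
            return entry[0]  # cache hit: the master is never contacted
        # Cache miss: one lightweight metadata RPC to the master.
        locations = self.master.get_chunk_locations(filename, chunk_index)
        self.cache[key] = (locations, time.monotonic() + self.CACHE_TTL_SECONDS)
        return locations
```

Once `locate_chunk` returns, the client reads or writes the chunk data directly from one of the returned ChunkServers; the master sees no further traffic for that chunk until the cache entry expires.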
If an interviewer asks "Why not use multiple masters?", the answer is: GFS chose simplicity. A single master avoids the need for distributed consensus (like Paxos or Raft) on metadata operations. The tradeoff is a potential bottleneck, which Google eventually hit at scale -- leading to Colossus. In an interview, acknowledge the tradeoff and explain how you'd mitigate it (caching, batching, keeping the master off the data path).
Chunk size
64MB per chunk is 16,384x larger than a typical filesystem block (4KB). This choice has cascading benefits:
| Benefit | Explanation |
|---|---|
| Less metadata | Fewer chunks per file means the master stores less data, keeping everything in memory |
| Fewer master interactions | Clients cache chunk locations; with large chunks, a single lookup covers more data |
| Long-lived TCP connections | A client reading a large chunk can reuse the same connection, amortizing TCP setup cost |
| Simpler ChunkServer management | Fewer chunks to track means easier capacity monitoring and load balancing |
| Efficient sequential I/O | Large chunks align perfectly with the large, sequential reads and appends that dominate GFS workloads |
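The "less metadata" row is easy to quantify. The arithmetic below is a back-of-the-envelope check, not a measurement; the per-chunk figure of under 64 bytes of master metadata comes from the GFS paper, and the 1TB file size is an arbitrary example.

```python
def chunks_needed(file_size_bytes, chunk_size_bytes):
    # Ceiling division: a partial final chunk still needs a metadata entry.
    return -(-file_size_bytes // chunk_size_bytes)

TB = 1024 ** 4
with_64mb_chunks = chunks_needed(1 * TB, 64 * 1024 * 1024)  # 16,384 entries
with_4kb_blocks = chunks_needed(1 * TB, 4 * 1024)           # 268,435,456 entries

# At under ~64 bytes of master metadata per chunk (the paper's figure),
# a 1TB file costs the master only about 1MB of memory:
master_memory_bytes = with_64mb_chunks * 64  # 1,048,576 bytes = 1MB
```

A 4KB block size would need 16,384x as many entries for the same file, which is exactly why the 64MB choice lets the master keep all metadata in memory.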
Lazy space allocation
GFS does not allocate the full 64MB on disk when creating a chunk. Instead, the ChunkServer lazily extends the chunk as the client appends data. This avoids internal fragmentation -- if a chunk only holds 20MB of data, only 20MB of disk is consumed.
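Lazy allocation amounts to letting the chunk's backing file grow with each append instead of preallocating it. The sketch below shows the idea using an ordinary local file; the class name, file layout, and error handling are assumptions for illustration, not the ChunkServer's actual implementation.

```python
import os

class ChunkFileSketch:
    """Illustrative sketch of lazy space allocation: the on-disk chunk
    file grows only as data is appended, never preallocated to 64MB."""

    CHUNK_SIZE = 64 * 1024 * 1024  # logical capacity of one chunk

    def __init__(self, path):
        self.path = path
        open(path, "ab").close()  # create an empty chunk file: 0 bytes on disk

    def append(self, data):
        # Refuse writes past the 64MB logical boundary; in real GFS the
        # client would start a new chunk via the master at this point.
        if os.path.getsize(self.path) + len(data) > self.CHUNK_SIZE:
            raise ValueError("chunk full: a new chunk is needed")
        with open(self.path, "ab") as f:
            f.write(data)

    def disk_usage(self):
        # Only the bytes actually written consume disk space.
        return os.path.getsize(self.path)
```

A chunk holding 20MB of appended data occupies 20MB on disk, so small or partially filled chunks cause no internal fragmentation.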
Large chunk size creates a hotspot problem for small files. A file smaller than 64MB occupies a single chunk, and if many clients read that file, the ChunkServers hosting that one chunk become overloaded. GFS handles this with two workarounds:
- Store small, popular files with a higher replication factor
- Add a random delay to application start times to stagger access
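Both workarounds are simple to express in code. The helpers below are hypothetical sketches: the jitter window, the "hot"/"cold" popularity flag, and the replication factors 3 and 6 are illustrative choices, not values from the paper (which uses a default of three replicas).

```python
import random

def staggered_start_delay(max_jitter_seconds=30.0, rng=random):
    """Sketch of the second workaround: each application instance sleeps
    a random delay before its first read, so a batch of clients does not
    hit the hot chunk's replicas at the same instant. The 30s window is
    an assumed tuning knob."""
    return rng.uniform(0.0, max_jitter_seconds)

def replication_factor(file_size_bytes, popularity, default=3, hot_factor=6):
    """Sketch of the first workaround: small, popular files get more
    replicas so reads spread across more ChunkServers. The thresholds
    and factors here are illustrative."""
    CHUNK_SIZE = 64 * 1024 * 1024
    if file_size_bytes <= CHUNK_SIZE and popularity == "hot":
        return hot_factor  # single-chunk hot file: replicate more widely
    return default
```

Neither mechanism removes the underlying skew; they just spread a single chunk's read load across more machines and more time.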
This hotspot problem is worth mentioning in interviews -- it shows you understand that design decisions have downsides.