Anatomy of a Read Operation
When a client wants to read a file, how does it find the right data across a cluster of hundreds of ChunkServers? The answer is a single metadata lookup followed by direct data transfer: the master is consulted briefly, then stays out of the data path entirely.
Step-by-step read flow
| Step | Actor | Action |
|---|---|---|
| 1 | Client | Translates the file name + byte offset into a chunk index (trivial math given the fixed 64 MB chunk size) |
| 2 | Client -> Master | Sends an RPC with the file name and chunk index |
| 3 | Master -> Client | Returns the chunk handle and locations of all replicas holding that chunk |
| 4 | Client | Caches the metadata (keyed by file name + chunk index) for future reads |
| 5 | Client -> ChunkServer | Sends a read request to the closest replica, specifying the chunk handle and byte range |
| 6 | ChunkServer -> Client | Returns the requested data |
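The six steps above can be sketched in a few lines. This is a minimal illustration, not GFS's actual client library: the `master` stub, the `chunkservers` map, and the method names are all assumptions, and "closest replica" is reduced to picking the first entry.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # fixed 64 MB GFS chunk size

def chunk_coordinates(byte_offset):
    """Step 1: trivial math from a byte offset to (chunk index, offset within chunk)."""
    return byte_offset // CHUNK_SIZE, byte_offset % CHUNK_SIZE

class Client:
    def __init__(self, master, chunkservers):
        self.master = master              # hypothetical master RPC stub
        self.chunkservers = chunkservers  # hypothetical {address: server} map
        self.cache = {}                   # (file name, chunk index) -> (handle, replicas)

    def read(self, file_name, offset, length):
        index, chunk_off = chunk_coordinates(offset)
        key = (file_name, index)
        if key not in self.cache:
            # Steps 2-4: one master RPC, result cached for future reads.
            self.cache[key] = self.master.get_locations(file_name, index)
        handle, replicas = self.cache[key]
        closest = replicas[0]  # stand-in for topology-aware replica selection
        # Steps 5-6: bytes flow directly between client and ChunkServer.
        return self.chunkservers[closest].read(handle, chunk_off, length)
```

Note that the master never sees the second read of the same chunk: the cached `(handle, replicas)` pair is enough to go straight to a ChunkServer.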
Key optimizations
- Batch metadata requests: The client typically requests metadata for multiple chunks in a single RPC. The master also proactively includes information for chunks immediately following the requested ones, anticipating sequential reads.
- Metadata caching: Cached chunk locations eliminate repeated master lookups. The cache uses timeouts to avoid serving stale location data.
- Closest replica selection: The client picks the nearest ChunkServer (by network topology), minimizing read latency.
- No data caching: Neither clients nor ChunkServers cache file data at the application level. The OS buffer cache on ChunkServers handles frequently accessed data. This avoids cache coherence complexity entirely.
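The metadata-caching bullet above can be made concrete with a small TTL cache. This is a sketch under assumptions: the paper does not prescribe a cache structure or timeout value, so the 60-second default and all names here are illustrative.

```python
import time

class ChunkLocationCache:
    """TTL'd map from (file name, chunk index) to replica locations.
    An expired entry forces a fresh master lookup rather than serving
    potentially stale locations."""

    def __init__(self, ttl_seconds=60.0):  # TTL value is an assumption
        self.ttl = ttl_seconds
        self.entries = {}  # key -> (expiry time, locations)

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        hit = self.entries.get(key)
        if hit and now < hit[0]:
            return hit[1]
        self.entries.pop(key, None)  # expired or missing: caller asks the master
        return None

    def put(self, key, locations, now=None):
        now = time.monotonic() if now is None else now
        self.entries[key] = (now + self.ttl, locations)
```

The timeout trades freshness for master load: a longer TTL means fewer master RPCs but a larger window in which a client can contact a replica that no longer holds the chunk.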
The GFS read path is a textbook example of separating control flow from data flow. The master handles the lightweight "where is my data?" question, then the heavy "give me the bytes" transfer happens directly between client and ChunkServer. This pattern appears in virtually every distributed storage system -- HDFS's NameNode works the same way, and even object stores like S3 separate metadata routing from data transfer. Call out this separation explicitly when designing storage systems in interviews.
The read path has no strong consistency guarantee. A client holding a cached location could read from a replica that missed a recent mutation and is therefore slightly stale. For GFS's append-heavy workloads, this typically means the client sees a premature end-of-chunk rather than corrupted data -- but applications must be written to tolerate it.
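One common way an application tolerates a premature end-of-chunk is to treat a short read as "data not visible here yet," invalidate the cached locations, and retry against a freshly looked-up replica. The sketch below is a hypothetical pattern, not a GFS API; `read_chunk` and `invalidate_cache` are placeholder callables.

```python
import time

def tolerant_read(read_chunk, invalidate_cache, offset, length, attempts=3):
    """Retry a short read: a stale replica may simply not have the
    freshly appended bytes yet. All names here are hypothetical."""
    data = b""
    for i in range(attempts):
        data = read_chunk(offset, length)
        if len(data) == length:
            return data
        invalidate_cache()           # drop cached locations, re-ask the master
        time.sleep(0.05 * (i + 1))   # brief backoff before retrying
    return data  # still short: caller must handle the premature end-of-chunk
```

If the data never appears, the short result is returned as-is -- the burden of interpreting an incomplete record stays with the application, exactly as the paragraph above describes.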