Google File System Introduction

Goal

Design a distributed file system that stores huge files (many gigabytes to terabytes) across thousands of machines. The system should be scalable, reliable, and highly available.

Why GFS matters

In the early 2000s, Google was crawling the entire internet and building a search index over it. The data volumes were staggering -- petabytes of web pages, constantly being recrawled and reprocessed. No commercial file system could handle it, and buying a proprietary solution at this scale would have been astronomically expensive.

So Google did what Google does: they built their own. But GFS wasn't just a bigger file system. Google studied their actual workloads and made deliberate, unconventional design choices that violated traditional file system assumptions:

  • Files are enormous (multi-GB), not small
  • Reads are mostly large and sequential, not small and random
  • Files are written once and appended to, rarely modified in place
  • Hardware failure is the norm, not the exception -- when you have thousands of commodity machines, something is always broken

These observations led to a design that looks nothing like a traditional file system -- and that's exactly why it worked.

Interview insight

GFS is the gold standard case study for designing around your workload. In an interview, when you're asked to design a storage system, the first thing to ask is: "What does the access pattern look like?" GFS shows how radically different the design becomes when you optimize for large sequential I/O instead of small random access.

What is GFS?

GFS is a scalable distributed file system built by Google for large, data-intensive applications. It runs on thousands of commodity machines, tolerates frequent hardware failures, and delivers high aggregate throughput to large numbers of clients.

| Design principle | What it means | Why |
| --- | --- | --- |
| Large chunk size (64 MB) | Files are split into 64 MB chunks, not 4 KB blocks | Reduces metadata overhead; aligns with large sequential I/O pattern |
| Single master | One master node manages all metadata | Simplifies coordination; master doesn't handle data flow |
| Replication (3x) | Every chunk is stored on 3 different machines | Hardware failure is constant; 3 copies ensure durability |
| Append-optimized | Concurrent appends are a first-class operation | Google's workloads produce data streams, not random edits |
| Relaxed consistency | Some operations have weaker guarantees | Simplifies design; applications handle edge cases |
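To see why the 64 MB chunk size matters, here is a minimal sketch of the offset-to-chunk translation a GFS client performs before asking the master where a chunk lives. The function names and constants are illustrative, not the actual Google client library:

```python
# Sketch of the offset-to-chunk translation a GFS-style client performs
# before asking the master for chunk locations. Names are illustrative.

CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB chunks, vs. ~4 KB blocks in a local FS

def chunk_index(byte_offset: int) -> int:
    """Which chunk of the file holds this byte offset."""
    return byte_offset // CHUNK_SIZE

def chunk_span(byte_offset: int, length: int) -> range:
    """All chunk indices a read of `length` bytes at `byte_offset` touches."""
    first = chunk_index(byte_offset)
    last = chunk_index(byte_offset + length - 1)
    return range(first, last + 1)

# A 1 GB file occupies only 16 chunks, so the master tracks 16 metadata
# entries instead of ~262,144 entries for 4 KB blocks.
print(chunk_index(200 * 1024 * 1024))  # offset 200 MB falls in chunk 3
print(list(chunk_span(60 * 1024 * 1024, 10 * 1024 * 1024)))  # [0, 1]
```

Large sequential reads touch few chunks, so a client needs only a handful of round trips to the master per file, and can cache the results.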

GFS use cases

  • Web crawling and indexing -- GFS was originally built to store data from Google's web crawler and serve it to the indexing pipeline
  • BigTable storage -- BigTable uses GFS as its underlying storage layer for log and data files
  • Large-scale data processing -- Gmail, YouTube, and Google Earth all use GFS for bulk data storage
  • MapReduce jobs -- GFS is the storage substrate for Google's MapReduce framework

APIs

GFS does not provide standard POSIX-like APIs -- another deliberate choice that freed the designers from legacy constraints. Instead, it exposes user-level APIs:

| Operation | Description |
| --- | --- |
| create | Create a new file |
| delete | Delete a file |
| open | Open a file, return a handle |
| close | Close a file handle |
| read | Read data from a file at a given offset |
| write | Write data to a file at a given offset |
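A toy sketch of what using such a user-level API might look like. The class and method names below are assumptions made for illustration (a real client talks to the master and chunkservers over the network; here an in-memory dict stands in for the cluster):

```python
# Hypothetical GFS-style user-level client. Names are illustrative
# assumptions, not the actual GFS client interface; an in-memory dict
# stands in for the distributed cluster state.

class GFSClient:
    def __init__(self):
        self._files = {}  # path -> file contents (toy stand-in)

    def create(self, path):
        self._files[path] = bytearray()

    def open(self, path):
        return path  # a real client would return an opaque handle

    def write(self, handle, offset, data):
        buf = self._files[handle]
        buf[offset:offset + len(data)] = data

    def read(self, handle, offset, length):
        return bytes(self._files[handle][offset:offset + length])

    def close(self, handle):
        pass

    def delete(self, path):
        del self._files[path]

client = GFSClient()
client.create("/crawl/pages-00001")
h = client.open("/crawl/pages-00001")
client.write(h, 0, b"<html>...</html>")
print(client.read(h, 0, 6))  # b'<html>'
client.close(h)
```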

Plus two special operations that reflect GFS's unique design priorities:

  • Snapshot -- Efficiently copy a file or directory tree. Used for checkpointing and branching large datasets.
  • Record Append -- Allows multiple clients to append data to the same file concurrently while guaranteeing atomicity. This is the operation GFS is most heavily optimized for -- it powers producer-consumer queues and multi-way merge results without requiring external locking.
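The key property of record append is that the *system*, not the client, chooses the offset, and each record is written as one atomic unit. A minimal single-process sketch of these semantics, with a lock standing in for the primary chunkserver that serializes appends (all names are illustrative assumptions):

```python
# Minimal sketch of record-append semantics: many producers append to one
# file concurrently, and each record lands atomically at an offset chosen
# by the system, not the client. A lock stands in for the primary
# chunkserver that serializes appends; names are illustrative.

import threading

class AppendOnlyFile:
    def __init__(self):
        self._data = bytearray()
        self._lock = threading.Lock()

    def record_append(self, record: bytes) -> int:
        with self._lock:
            offset = len(self._data)   # the system picks the offset
            self._data.extend(record)  # the record is written as one unit
            return offset

f = AppendOnlyFile()
offsets = []

def producer(tag: bytes):
    for i in range(100):
        offsets.append(f.record_append(tag + b":%d;" % i))

threads = [threading.Thread(target=producer, args=(t,)) for t in (b"A", b"B", b"C")]
for t in threads: t.start()
for t in threads: t.join()

# Records from A, B, and C interleave in some order, but none is torn:
# each record appears contiguously at the offset returned to its producer.
print(len(offsets))  # 300 appends completed
```

In real GFS the guarantee is *at-least-once*: a failed replica can force a retry, leaving a duplicate or padding in the file, which is why readers are expected to tolerate duplicates (e.g. via record checksums or IDs).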

Why no POSIX?

POSIX compliance would have forced GFS to support semantics that Google's workloads don't need -- hard links, file locking, and strict consistency for small random writes. By dropping POSIX, GFS gained the freedom to optimize purely for its actual access patterns. The lesson: don't pay for abstractions you won't use.

What's next

In the following chapters, we'll explore: