Summary: HDFS
The big picture
HDFS is the open-source world's answer to GFS. It takes GFS's core architecture (single master, distributed storage nodes, large block sizes, replication) and adapts it for the Hadoop ecosystem, where the primary workload is MapReduce-style batch processing.
HDFS made a key simplifying trade-off compared to GFS: no concurrent writers. GFS supported multiple clients appending to the same file simultaneously (a complex feature). HDFS restricts each file to a single writer with append-only access. This dramatically simplified the consistency model and implementation, at the cost of ruling out some use cases.
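The single-writer rule in the table below is enforced with leases: a client must hold the lease on a file to write to it, and an expired lease frees the file for another writer. A minimal sketch of that idea (the real NameNode lease manager is more involved; the class, timeout value, and method names here are made up for illustration):

```python
import time

LEASE_TIMEOUT = 60.0  # seconds; hypothetical value, not HDFS's actual setting

class LeaseManager:
    """Toy single-writer enforcement via per-file leases."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.leases = {}  # path -> (client_id, expiry_time)

    def acquire(self, path, client_id):
        holder = self.leases.get(path)
        now = self.clock()
        # Reject if a *different* client holds an unexpired lease.
        if holder and holder[0] != client_id and holder[1] > now:
            raise PermissionError("file already has a writer")
        # Grant (or renew) the lease for LEASE_TIMEOUT seconds.
        self.leases[path] = (client_id, now + LEASE_TIMEOUT)

    def renew(self, path, client_id):
        # Renewal is just re-acquisition by the current holder.
        self.acquire(path, client_id)
```

A second client attempting to open the same file for writing fails until the first client's lease expires (e.g., because it crashed and stopped renewing).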
Understanding HDFS alongside GFS reveals what's essential in the distributed file system design (master/worker split, large blocks, replication) versus what's workload-specific (concurrent appends, relaxed consistency).
Architecture at a glance
| Component | Role | GFS equivalent |
|---|---|---|
| NameNode | Single master; manages all metadata in memory | Master |
| DataNodes | Store 128MB blocks as local files | ChunkServers |
| Clients | Talk to NameNode for metadata, directly to DataNodes for data | Clients |
| EditLog | Write-ahead log for metadata changes | Operation log |
| FsImage | Periodic checkpoint of NameNode state | Checkpoint |
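The key consequence of this split is the read path: clients contact the NameNode only for block locations, then stream data directly from DataNodes, keeping bulk data off the master. A minimal sketch of that flow, assuming toy in-process stand-ins (none of these class or method names are the real HDFS API):

```python
class NameNode:
    """Holds only metadata: which blocks make up a file, and where they live."""

    def __init__(self):
        # file path -> ordered list of (block_id, [datanode indices])
        self.block_map = {}

    def get_block_locations(self, path):
        return self.block_map[path]

class DataNode:
    """Stores block contents; serves them directly to clients."""

    def __init__(self):
        self.blocks = {}  # block_id -> bytes

    def read_block(self, block_id):
        return self.blocks[block_id]

def read_file(namenode, datanodes, path):
    """One metadata round-trip, then data flows DataNode -> client."""
    data = b""
    for block_id, replicas in namenode.get_block_locations(path):
        node = datanodes[replicas[0]]  # real clients prefer the closest replica
        data += node.read_block(block_id)
    return data
```

Because the NameNode never touches file contents, its load scales with the number of metadata operations, not with the volume of data read.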
How HDFS uses system design patterns
| Problem | Pattern | How HDFS uses it |
|---|---|---|
| Surviving NameNode crashes | Write-ahead Log | All metadata changes written to EditLog before being applied |
| Monitoring DataNodes | Heartbeat | DataNodes send heartbeats every few seconds; missed heartbeats trigger re-replication |
| Detecting data corruption | Checksum | Checksums computed per 512 bytes; verified on reads, corrupt blocks re-replicated |
| Coordinating reads/writes | Leader and Follower | NameNode (leader) directs all metadata operations; DataNodes (followers) store data |
| Preventing split-brain | Fencing | ZooKeeper ensures only one NameNode is active; STONITH fences the old active |
| Write access control | Lease | Clients hold leases for file write access; expired leases free the file |
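The write-ahead log pattern from the first row is the backbone of NameNode durability: every metadata mutation is appended to a durable log before the in-memory namespace is updated, so a crash can be recovered by replaying the log. A minimal sketch of the idea (this is not Hadoop's actual EditLog format; record layout and names are illustrative):

```python
import json

class MetadataStore:
    """WAL sketch: log first, apply second, recover by replay."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.namespace = {}  # in-memory metadata, like the NameNode's

    def create_file(self, path, replication):
        record = {"op": "create", "path": path, "replication": replication}
        with open(self.log_path, "a") as log:
            log.write(json.dumps(record) + "\n")  # 1. durable log entry first...
            log.flush()
        self.namespace[path] = record             # 2. ...then the in-memory update

    @classmethod
    def recover(cls, log_path):
        """Rebuild in-memory state after a crash by replaying the log."""
        store = cls(log_path)
        with open(log_path) as log:
            for line in log:
                record = json.loads(line)
                store.namespace[record["path"]] = record
        return store
```

The FsImage checkpoint exists precisely to bound this replay: recovery loads the latest checkpoint and replays only the EditLog entries written after it.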
HDFS vs. GFS
| Aspect | GFS | HDFS |
|---|---|---|
| Block/chunk size | 64MB | 128MB |
| Concurrent appends | Yes (multiple writers) | No (single writer) |
| Random writes | Limited | None (append-only) |
| Consistency | Relaxed | Strong (single-writer simplification) |
| Master HA | Log replication + manual failover | ZooKeeper-based automatic failover |
| Open source | No | Yes |
The NameNode problem (and the solution)
HDFS's biggest limitation is the single NameNode: all metadata lives in one node's memory, creating both a scalability ceiling (limited by RAM) and a single point of failure.
HDFS High Availability solves the failure problem:
- An Active NameNode and a Standby NameNode run simultaneously
- Both share access to the EditLog (via shared NFS or a quorum journal)
- If the active fails, the standby takes over automatically
- Fencing (STONITH) ensures the old active can't corrupt shared state
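One way the quorum journal achieves fencing is with epoch numbers: each failover bumps an epoch, and journal nodes reject writes stamped with a stale epoch, so a deposed active NameNode physically cannot append edits. A simplified model of that mechanism (an assumption-level sketch, not the real JournalNode protocol):

```python
class JournalNode:
    """Accepts edits only from the writer holding the newest epoch."""

    def __init__(self):
        self.promised_epoch = 0
        self.entries = []

    def new_epoch(self, epoch):
        # A new active NameNode must claim a strictly higher epoch.
        if epoch <= self.promised_epoch:
            raise RuntimeError("stale epoch")
        self.promised_epoch = epoch

    def journal(self, epoch, entry):
        # Writes from an older epoch are rejected: the old active is fenced.
        if epoch < self.promised_epoch:
            raise RuntimeError("fenced: a newer NameNode is active")
        self.entries.append(entry)
```

Once the standby claims epoch N+1 on a quorum of journal nodes, any in-flight write from the old active (still at epoch N) fails, which is what prevents it from corrupting shared state even if STONITH is slow to take effect.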
The scalability ceiling (number of files limited by NameNode memory) remains a fundamental limitation of HDFS's design. HDFS Federation partially addresses this by allowing multiple independent NameNodes, each managing a portion of the namespace.
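To see why the memory ceiling bites, a back-of-envelope estimate helps. A commonly cited rule of thumb is roughly 150 bytes of NameNode heap per namespace object (file, directory, or block); treat that constant as an approximation, not a spec:

```python
BYTES_PER_OBJECT = 150  # rough rule of thumb for NameNode heap per object

def max_objects(heap_bytes):
    """Approximate namespace-object capacity for a given heap size."""
    return heap_bytes // BYTES_PER_OBJECT

heap = 64 * 1024**3  # a 64 GB heap
print(max_objects(heap))  # on the order of ~458 million objects
```

Since each small file costs at least two objects (one file entry plus one block), a 64 GB heap tops out around a couple hundred million small files, which is why HDFS struggles with many-small-files workloads and why Federation shards the namespace across NameNodes.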
Quick reference card
| Property | Value |
|---|---|
| Type | Distributed file system |
| CAP classification | CP -- strongly consistent |
| Architecture | Single NameNode + multiple DataNodes |
| Block size | 128MB (configurable) |
| Replication | 3 replicas per block (configurable), rack-aware placement |
| Consistency | Strong; write-once, read-many; single writer per file |
| Write model | Append-only, no random writes |
| Metadata | In-memory on NameNode, persisted via EditLog + FsImage |
| Failure recovery | ZooKeeper-based HA with automatic failover |
| Open source | Yes (Apache Hadoop) |