Summary: Cassandra

The big picture

Cassandra is a hybrid: Dynamo's architecture powering BigTable's data model. It takes the peer-to-peer, leaderless replication strategy from Dynamo and combines it with the column-family storage model and SSTable-based storage engine from BigTable. The result is a system that offers the best of both worlds -- decentralized scalability with rich, structured data access.

What makes Cassandra distinctive is its tunable consistency. Unlike Dynamo (which defaults to eventual consistency) or BigTable (which enforces strong consistency), Cassandra lets you choose per-query where you land on the consistency-availability spectrum. This flexibility is why it's adopted so widely -- different parts of the same application can make different trade-offs.
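Tunable consistency reduces to quorum arithmetic: with replication factor N, read consistency level R, and write consistency level W, a read is guaranteed to overlap the latest write whenever R + W > N. A minimal sketch of that rule (the helper function is invented for illustration, not part of any Cassandra API):

```python
def is_strongly_consistent(n: int, r: int, w: int) -> bool:
    """Reads see the latest write when read and write quorums must intersect."""
    return r + w > n

# N=3 replicas: QUORUM reads + QUORUM writes (2 + 2 > 3) -> strong consistency
print(is_strongly_consistent(3, 2, 2))   # True
# ONE read + ONE write (1 + 1 <= 3) -> eventual consistency, lower latency
print(is_strongly_consistent(3, 1, 1))   # False
```

This is the knob Cassandra exposes per query: a latency-sensitive dashboard might read at ONE while a billing write uses QUORUM, both against the same table.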

How Cassandra uses system design patterns

| Problem | Pattern | How Cassandra uses it |
| --- | --- | --- |
| Distributing data across nodes | Consistent Hashing | Ring topology with virtual nodes for even distribution |
| Ensuring write durability | Write-ahead Log | Every write goes to the commit log before the memtable |
| Managing log size | Segmented Log | Commit log is split into segments, truncated after flush to SSTables |
| Tuning read/write consistency | Quorum | Configurable R, W, and N per query |
| Spreading cluster state | Gossip Protocol | Every second, nodes gossip about membership, load, and schema |
| Detecting node failures | Phi Accrual Failure Detection | Adaptive detection that learns from network conditions |
| Handling temporary failures | Hinted Handoff | Healthy nodes store writes for downed nodes |
| Repairing stale replicas | Read Repair | Stale replicas updated during read operations |
| Avoiding unnecessary disk reads | Bloom Filters | Each SSTable has a Bloom filter to skip non-matching lookups |
| Distinguishing pre/post restart state | Generation Clock (split-brain avoidance) | Generation number incremented on restart, included in gossip |
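The first pattern in the table is easy to sketch. Consistent hashing with virtual nodes maps each physical node to many tokens on a ring, so keys spread evenly and adding a node moves only a small slice of data. A simplified illustration (class and method names are invented for this sketch):

```python
import bisect
import hashlib

class Ring:
    """Consistent-hash ring with virtual nodes (a simplified sketch)."""

    def __init__(self, nodes, vnodes=8):
        self._ring = []  # sorted list of (token, node) pairs
        for node in nodes:
            for i in range(vnodes):
                # Each physical node owns many tokens, smoothing distribution.
                bisect.insort(self._ring, (self._hash(f"{node}:{i}"), node))

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def owner(self, key: str) -> str:
        """Walk clockwise from the key's token to the first vnode at or after it."""
        idx = bisect.bisect(self._ring, (self._hash(key),)) % len(self._ring)
        return self._ring[idx][1]

# Example: three nodes, eight vnodes each -> 24 tokens on the ring
ring = Ring(["n1", "n2", "n3"])
print(ring.owner("sensor-42"))  # one of n1/n2/n3, stable for this key
```

Replication in Cassandra extends this by also writing to the next N-1 distinct nodes clockwise from the owner; that detail is omitted here.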

Cassandra's DNA: what it took from each parent

| Component | From Dynamo | From BigTable |
| --- | --- | --- |
| Architecture | Peer-to-peer, no leader | |
| Partitioning | Consistent hashing + vnodes | |
| Replication | Quorum-based | |
| Failure detection | Gossip protocol | |
| Failure handling | Hinted handoff | |
| Data model | | Column families, sparse rows |
| Storage engine | | MemTable → SSTable flush |
| On-disk format | | SSTables with Bloom filters |
| Compaction | | Merge SSTables to reclaim space |

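The BigTable-derived write path can be sketched end to end: append to a commit log for durability, buffer the write in a memtable, and flush sorted, immutable runs to disk when the memtable fills. A toy illustration (not Cassandra's actual engine; names and the flush threshold are invented):

```python
class MiniLSM:
    """Toy log-structured write path: commit log -> memtable -> SSTable flush."""

    def __init__(self, memtable_limit=3):
        self.commit_log = []    # durability: every write lands here first
        self.memtable = {}      # in-memory buffer, last write per key wins
        self.sstables = []      # immutable sorted runs on "disk", newest last
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.commit_log.append((key, value))   # write-ahead log entry
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Dump the memtable as a sorted, immutable run.
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable.clear()
        self.commit_log.clear()  # the flushed segment can now be truncated

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):  # newest SSTable first
            if key in sstable:
                return sstable[key]
        return None
```

Compaction (merging SSTables and dropping shadowed versions) and the per-SSTable Bloom filter that short-circuits `get` are left out to keep the sketch small.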
What Cassandra dropped

Cassandra uses last-write-wins (LWW) instead of Dynamo's vector clocks: concurrent writes to the same key are resolved by timestamp, and the "loser" is silently discarded. The API is simpler, but silent data loss is possible. For Cassandra's typical workloads (time-series data, event logs), this trade-off is acceptable.
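The rule itself is one line: among the versions the replicas hold, keep the one with the highest timestamp. A sketch, assuming a hypothetical (timestamp, value) tuple format:

```python
def resolve_lww(versions):
    """Last-write-wins: keep the value with the highest timestamp.
    Every concurrent write with an older timestamp is silently discarded."""
    return max(versions, key=lambda versioned: versioned[0])[1]

# Two replicas accepted different concurrent writes for the same key:
print(resolve_lww([(1700000001, "temp=21"), (1700000005, "temp=22")]))  # temp=22
```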

Quick reference card

| Property | Value |
| --- | --- |
| Type | Wide-column NoSQL database |
| CAP classification | AP (tunable toward CP) |
| Consistency model | Tunable -- per-query consistency levels |
| Data model | Row key → column families → columns (sparse) |
| Partitioning | Consistent hashing with virtual nodes |
| Replication | Configurable replication factor and consistency level |
| Conflict resolution | Last-write-wins (timestamp-based) |
| Failure detection | Gossip + Phi Accrual Failure Detector |
| Storage engine | MemTable → SSTable (log-structured merge) |
| Open source | Yes (Apache) |

Design Challenge

Design a time-series metrics store for IoT

You need to design a time-series metrics store for an IoT platform with 10,000 sensors, each reporting 100 metrics every second -- producing 1 million writes per second. Queries retrieve metrics by sensor ID and time range. The system must tolerate full data center failures without downtime.
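One common starting point -- offered here as a sketch, not the expected answer -- is to bucket the partition key by sensor and time window, so no partition grows without bound while a time-range query simply enumerates the buckets it spans (function names and the one-hour bucket width are assumptions to be tuned to the workload):

```python
def partition_key(sensor_id: str, epoch_seconds: int, bucket_hours: int = 1):
    """One partition per sensor per time bucket, so partitions stay bounded."""
    return (sensor_id, epoch_seconds // (bucket_hours * 3600))

def buckets_for_range(sensor_id: str, start: int, end: int, bucket_hours: int = 1):
    """All partitions a time-range query for one sensor must touch."""
    width = bucket_hours * 3600
    return [(sensor_id, b) for b in range(start // width, end // width + 1)]

# A reading at t=7200s lands in sensor s1's third hourly bucket:
print(partition_key("s1", 7200))            # ('s1', 2)
# A two-hour range query touches three hourly partitions:
print(buckets_for_range("s1", 0, 7200))     # [('s1', 0), ('s1', 1), ('s1', 2)]
```

Pairing this with a multi-data-center replication strategy and quorum levels scoped to the local data center is the usual route to surviving a full data center failure.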
References and further reading