Compaction
SSTables are immutable -- great for write performance, but immutability means updates and deletes create new entries rather than modifying existing ones. Over time, you accumulate many SSTables full of redundant and obsolete data, and reads slow down because every one of them must be checked.
Compaction is the process of merging multiple SSTables into fewer, cleaner ones.
What compaction does
During compaction:
- Multiple SSTables are merged into a single new SSTable
- Keys are deduplicated -- only the latest version is kept
- Tombstones (delete markers) older than the grace period (gc_grace_seconds) are removed
- A new index is created over the merged data
- The old SSTables are deleted
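The merge-and-clean steps above can be sketched in a few lines. This is a simplified model, not Cassandra's implementation: each SSTable is a key-sorted list of `(key, timestamp, value)` entries, a `None` value marks a tombstone, and the `gc_grace_seconds` default mirrors Cassandra's (10 days, in seconds).

```python
import heapq
import time

def compact(sstables, gc_grace_seconds=864000, now=None):
    """Merge several key-sorted SSTables into one: keep only the newest
    version of each key, and purge tombstones older than gc_grace_seconds.
    A teaching sketch, not Cassandra's actual compaction code."""
    now = time.time() if now is None else now
    # Merge all runs ordered by (key, newest timestamp first).
    merged = heapq.merge(*sstables, key=lambda e: (e[0], -e[1]))
    out, last_key = [], object()
    for key, ts, value in merged:
        if key == last_key:
            continue  # older version of a key we already emitted: drop it
        last_key = key
        if value is None and now - ts > gc_grace_seconds:
            continue  # expired tombstone: safe to remove entirely
        out.append((key, ts, value))
    return out
```

Note that a recent tombstone survives compaction: it must stick around until the grace period passes, so that a replica that missed the delete cannot resurrect the row.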
Benefits:
- Fewer SSTables to scan during reads → faster reads
- Obsolete data removed → disk space reclaimed
- Fresher Bloom filters → more accurate filtering
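The first benefit is easiest to see from the read path itself: a read consults the MemTable, then each SSTable from newest to oldest, using the Bloom filter to skip files that definitely don't contain the key. A minimal sketch, with the Bloom filter modeled as a plain set (a real filter can false-positive, never false-negative):

```python
def read(key, memtable, sstables):
    """Sketch of the read path: MemTable first, then SSTables newest to
    oldest. Each SSTable is a (bloom, data) pair; the Bloom filter lets us
    skip a disk read when it reports the key is definitely absent."""
    if key in memtable:
        return memtable[key]
    for bloom, data in sstables:  # ordered newest first
        if key not in bloom:
            continue  # filter says "definitely not here": no disk I/O
        if key in data:
            return data[key]
    return None
```

Every extra SSTable is another filter check and, on a hit, another disk read -- which is exactly why compacting down to fewer SSTables speeds reads up.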
Compaction strategies
| Strategy | Best for | How it works |
|---|---|---|
| Size-Tiered (STCS) | Write-heavy, general workloads | Triggers when multiple SSTables of similar size exist. Groups and merges them. Default strategy. |
| Leveled (LCS) | Read-heavy workloads | Organizes SSTables into levels, each 10x larger than the previous. Guarantees at most one SSTable per partition per level → predictable read performance. |
| Time-Window (TWCS) | Time-series data | Groups SSTables by time window. Compacts within windows. Ideal for data that's immutable after a time period and can be bulk-deleted by dropping entire windows. |
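Size-tiered selection can be sketched as a bucketing pass: group SSTables whose sizes fall within a tolerance band around a bucket's running average, and nominate any bucket holding at least `min_threshold` tables for compaction. The parameter names and defaults below echo STCS (`bucket_low`, `bucket_high`, `min_threshold=4`) but the logic is deliberately simplified:

```python
def size_tiered_candidates(sstable_sizes, bucket_low=0.5, bucket_high=1.5,
                           min_threshold=4):
    """Simplified sketch of size-tiered bucket selection: an SSTable joins a
    bucket when its size lies within [avg*bucket_low, avg*bucket_high] of the
    bucket's running average size."""
    buckets = []  # each bucket: [running_avg, [member_sizes]]
    for size in sorted(sstable_sizes):
        for bucket in buckets:
            avg = bucket[0]
            if avg * bucket_low <= size <= avg * bucket_high:
                bucket[1].append(size)
                bucket[0] = sum(bucket[1]) / len(bucket[1])
                break
        else:
            buckets.append([size, [size]])  # no similar bucket: start one
    # Only buckets with enough similarly-sized tables trigger a compaction.
    return [members for avg, members in buckets if len(members) >= min_threshold]
```

With sizes like `[10, 11, 12, 10, 100, 105]`, the four ~10-unit tables form a candidate bucket while the two ~100-unit tables wait for more peers -- the "triggers when multiple SSTables of similar size exist" behavior from the table above.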
Why writes are sequential
Every operation in Cassandra's write path -- commit log append, MemTable insert, SSTable flush -- is sequential I/O. No random seeks, no read-before-write. This is the primary reason writes are so fast.
Compaction is the deferred cost: it reorganizes data in the background using sequential I/O. You pay for the reorganization eventually, but you never pay for it during the initial write. This amortization is why Cassandra can sustain hundreds of thousands of writes per second.
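The whole write path fits in a short sketch: append to the commit log for durability, insert into the in-memory MemTable, and flush the MemTable as a sorted, immutable SSTable once it passes a threshold. Class and field names here are illustrative, and the commit log is modeled as an in-memory list standing in for an append-only file:

```python
class WritePath:
    """Sketch of Cassandra's sequential write path. Both durable writes
    (commit log append, SSTable flush) are appends -- no random I/O."""

    def __init__(self, flush_threshold=4):
        self.commit_log = []   # stand-in for an append-only log file
        self.memtable = {}     # in-memory, sorted on flush
        self.sstables = []     # each flush adds one immutable sorted run
        self.flush_threshold = flush_threshold

    def write(self, key, value):
        self.commit_log.append((key, value))   # sequential append #1
        self.memtable[key] = value             # memory insert, no disk seek
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Sequential append #2: stream the MemTable out in sorted key order.
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}
        self.commit_log.clear()  # flushed data no longer needs log replay
```

Notice what never happens: no existing file is modified, and no write reads old data first. The sorting and merging that a B-tree would do per-write is deferred to flush and compaction, which is the amortization described above.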
When asked "How does Cassandra achieve such high write throughput?", the answer is: all writes are sequential appends (commit log + SSTable flush), and compaction amortizes the reorganization cost in the background. The trade-off: reads are more complex (must merge across MemTable + multiple SSTables) and compaction consumes background I/O.