Summary: Kafka
The big picture
Kafka's genius is its simplicity. At its core, it's an append-only, distributed commit log -- and by constraining its data structure to sequential appends and reads, it achieves throughput that seems impossible for a disk-based system. Millions of messages per second, stored durably, replayable by any number of consumers at any time.
Kafka didn't just replace traditional message queues -- it created a new category: the event streaming platform. Instead of "deliver this message and forget it," Kafka says "store this event durably and let anyone who cares read it at their own pace." This single shift enables architectures (event sourcing, CQRS, change data capture) that are impractical with traditional messaging.
Key concepts at a glance
| Concept | What it is | Why it matters |
|---|---|---|
| Topic | A named stream of messages | Logical organization -- producers write to topics, consumers read from them |
| Partition | A subset of a topic, stored on one broker | The unit of parallelism -- more partitions = more throughput |
| Broker | A Kafka server | Stores partitions on disk, serves reads and writes |
| Replica | A copy of a partition on another broker | Fault tolerance -- if a broker dies, replicas take over |
| ISR (In-Sync Replicas) | Replicas that are caught up with the leader | Only ISR members can become the new leader |
| Consumer group | A set of consumers that share the work of reading a topic | Each partition is consumed by exactly one member of the group |
| Offset | A message's position in a partition | Consumers track offsets to know where they are in the stream |
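The concepts above fit together mechanically: a topic is a set of append-only partition logs, and an offset is just a message's index in one of those logs. A toy sketch (not the real Kafka client API) makes the relationship concrete:

```python
# Toy model of topics, partitions, and offsets -- not the real Kafka API.
class Topic:
    def __init__(self, name, num_partitions):
        self.name = name
        # Each partition is an independent append-only log.
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, partition, message):
        log = self.partitions[partition]
        log.append(message)
        return len(log) - 1  # the new message's offset within this partition

    def read(self, partition, offset):
        # Reads are by position; old messages stay readable until retention expires.
        return self.partitions[partition][offset]

orders = Topic("orders", num_partitions=3)
off0 = orders.append(0, "order-1")
off1 = orders.append(0, "order-2")
assert (off0, off1) == (0, 1)          # offsets are just positions in the log
assert orders.read(0, 0) == "order-1"  # any consumer can re-read any old offset
```

Because each partition is an independent log, ordering is guaranteed only within a partition -- which is exactly why the consumer group model assigns each partition to exactly one group member.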
How Kafka uses system design patterns
| Problem | Pattern | How Kafka uses it |
|---|---|---|
| Durable message storage | Write-ahead Log | Kafka is a distributed write-ahead log -- every message is durably appended |
| Managing log size | Segmented Log | Partitions are split into segments for efficient purging and lookup |
| Tracking replication progress | High-Water Mark | Consumers only see messages up to the high-water mark (replicated to all in-sync replicas) |
| Partition leadership | Leader and Follower | Each partition has one leader that handles all reads/writes |
| Preventing split-brain (zombie controllers) | Epoch Number | The controller epoch lets brokers reject commands from stale controllers |
| Verifying data integrity | Checksum | CRC32 in each message record, verified by brokers and consumers |
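The checksum pattern in the last row is the simplest to illustrate. A minimal sketch (using Python's standard `zlib.crc32` as a stand-in for Kafka's record-level CRC):

```python
import zlib

def make_record(payload: bytes) -> tuple[int, bytes]:
    # Producer side: stamp the payload with a CRC32 so that corruption
    # anywhere downstream (network, disk) is detectable.
    return zlib.crc32(payload), payload

def verify(record: tuple[int, bytes]) -> bool:
    # Broker/consumer side: recompute the checksum and compare.
    crc, payload = record
    return zlib.crc32(payload) == crc

rec = make_record(b"order-42")
assert verify(rec)

corrupted = (rec[0], b"order-43")  # payload altered after the CRC was taken
assert not verify(corrupted)
```

The same check runs at multiple hops -- the broker can verify on append and the consumer on fetch -- so a single bit flip is caught before it silently propagates.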
Kafka's delivery semantics
| Guarantee | How it works | When to use |
|---|---|---|
| At-most-once | Consumer commits offset before processing. If it crashes, the message is skipped. | Low-value messages where losing some is acceptable (metrics, logs) |
| At-least-once | Consumer commits offset after processing. If it crashes, the message is reprocessed. | Most use cases -- idempotent consumers handle duplicates |
| Exactly-once | Idempotent producers + transactional writes. Within Kafka, each message's effect is recorded exactly once. | Financial transactions, inventory counts -- anything where duplicates or losses are unacceptable |
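The difference between the first two rows is purely the *order* of commit and process. A toy simulation (no real Kafka involved) shows what a crash between the two steps does to each guarantee:

```python
# Toy simulation: commit-before-process (at-most-once) vs
# process-before-commit (at-least-once), with a crash between the two steps.
def run(log, start, commit_first, crash_between=False):
    """Consume from `start`; optionally 'crash' between the two steps of
    handling the first message. Returns (processed, committed_offset)."""
    processed, committed = [], start
    for offset in range(start, len(log)):
        if commit_first:
            committed = offset + 1           # step 1: commit the offset
            if crash_between and offset == start:
                return processed, committed  # crash: message never processed
            processed.append(log[offset])    # step 2: process
        else:
            processed.append(log[offset])    # step 1: process
            if crash_between and offset == start:
                return processed, committed  # crash: offset never committed
            committed = offset + 1           # step 2: commit the offset
    return processed, committed

log = ["m0", "m1"]

# At-most-once: crash after commit, before processing -> m0 is lost.
p1, c1 = run(log, 0, commit_first=True, crash_between=True)
p2, _ = run(log, c1, commit_first=True)
assert p1 + p2 == ["m1"]

# At-least-once: crash after processing, before commit -> m0 is reprocessed.
p1, c1 = run(log, 0, commit_first=False, crash_between=True)
p2, _ = run(log, c1, commit_first=False)
assert p1 + p2 == ["m0", "m0", "m1"]
```

This is why "idempotent consumers" appear in the at-least-once row: the duplicate `m0` is harmless only if processing it twice has the same effect as processing it once.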
When asked "How does Kafka guarantee exactly-once delivery?", the answer involves three mechanisms: idempotent producers (each message gets a sequence number, broker deduplicates), transactions (atomic writes across multiple partitions), and consumer offset management (offsets committed atomically with processing results).
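The first mechanism -- broker-side deduplication by producer sequence number -- can be sketched in a few lines (a simplified model, not the real broker logic; real Kafka tracks sequences per producer *and* partition):

```python
# Toy broker-side dedup for an idempotent producer: each producer has an ID
# and stamps every message with a monotonically increasing sequence number.
class Broker:
    def __init__(self):
        self.log = []
        self.last_seq = {}  # producer_id -> highest sequence number appended

    def append(self, producer_id, seq, message):
        if seq <= self.last_seq.get(producer_id, -1):
            return False  # a retry of something already appended: discard
        self.log.append(message)
        self.last_seq[producer_id] = seq
        return True

b = Broker()
assert b.append("p1", 0, "order-1")
assert b.append("p1", 1, "order-2")
assert not b.append("p1", 1, "order-2")  # network retry of seq 1: deduplicated
assert b.log == ["order-1", "order-2"]
```

This is what lets a producer safely retry on a timed-out acknowledgment: even if the original write actually succeeded, the retry's sequence number exposes it as a duplicate.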
Quick reference card
| Property | Value |
|---|---|
| Type | Distributed streaming platform / commit log |
| CAP classification | CP (within each partition) |
| Consistency | Strong consistency per-partition via ISR |
| Data model | Topics → Partitions → ordered log of messages |
| Partitioning | Topic-level, producer-configured (key-based or round-robin) |
| Replication | Leader + in-sync replicas (ISR) |
| Ordering guarantee | Per-partition only (not across partitions) |
| Storage | Segmented log files on disk, retention-based cleanup |
| Coordination | KRaft (default in newer versions); ZooKeeper in legacy deployments |
| Open source | Yes (Apache) |
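The key-based partitioning noted in the card is what makes per-key ordering work: the same key always hashes to the same partition. A simplified sketch -- Kafka's default partitioner actually uses murmur2 for keyed messages; MD5 here is just a stable stand-in for illustration:

```python
import hashlib

NUM_PARTITIONS = 6

def partition_for(key: bytes) -> int:
    # Simplified: hash(key) % num_partitions. Real Kafka hashes keys with
    # murmur2; MD5 is used here only as a deterministic stand-in.
    digest = hashlib.md5(key).digest()
    return int.from_bytes(digest[:4], "big") % NUM_PARTITIONS

# Same key -> same partition, so all events for one entity stay ordered.
assert partition_for(b"user-42") == partition_for(b"user-42")
assert 0 <= partition_for(b"user-7") < NUM_PARTITIONS
```

A consequence worth remembering: changing the partition count changes where keys land, which is why repartitioning an existing keyed topic breaks ordering assumptions.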