
Summary: Kafka

The big picture

Kafka's genius is its simplicity. At its core, it's an append-only, distributed commit log -- and by constraining its data structure to sequential appends and reads, it achieves throughput that seems impossible for a disk-based system. Millions of messages per second, stored durably, replayable by any number of consumers at any time.

Kafka didn't just replace traditional message queues -- it created a new category: the event streaming platform. Instead of "deliver this message and forget it," Kafka says "store this event for as long as the retention policy allows (forever, if you configure it that way) and let anyone who cares read it at their own pace." This single shift enables architectures (event sourcing, CQRS, change data capture) that are difficult or impractical to build on a queue that deletes messages once they are delivered.
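To make the "append-only log that anyone can replay" idea concrete, here is a minimal in-memory sketch of a single partition. The `CommitLog` class and its methods are hypothetical names for illustration, not Kafka's API, and the sketch leaves out disk storage, replication, and retention.

```python
from dataclasses import dataclass, field


@dataclass
class CommitLog:
    """Toy single-partition, in-memory append-only log (illustrative only)."""
    records: list = field(default_factory=list)

    def append(self, value):
        # Writes only ever go to the end; the offset is the record's position.
        self.records.append(value)
        return len(self.records) - 1

    def read(self, offset, max_records=10):
        # Reading does not remove anything, so any reader can start anywhere.
        return self.records[offset:offset + max_records]


log = CommitLog()
for event in ["order-created", "payment-received", "order-shipped"]:
    log.append(event)

# Two independent readers track their own position in the same log.
billing_offset = 0
audit_offset = 0

billing_batch = log.read(billing_offset, max_records=2)
billing_offset += len(billing_batch)

audit_batch = log.read(audit_offset)  # a later reader replays from the start
audit_offset += len(audit_batch)
```

Because a read never deletes data, the slow reader and the fast reader do not interfere with each other; each one only advances its own offset.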

Key concepts at a glance

Concept | What it is | Why it matters
Topic | A named stream of messages | Logical organization -- producers write to topics, consumers read from them
Partition | An ordered, append-only slice of a topic; each copy of it lives on one broker | The unit of parallelism -- more partitions allow more parallel producers and consumers, up to what the brokers can handle
Broker | A Kafka server | Stores partitions on disk, serves reads and writes
Replica | A copy of a partition on another broker | Fault tolerance -- if a broker dies, replicas take over
ISR (In-Sync Replicas) | Replicas that are caught up with the leader | Only ISR members can become the new leader (unless unclean leader election is enabled)
Consumer group | A set of consumers that share the work of reading a topic | Each partition is consumed by exactly one member of the group
Offset | A message's position in a partition | Consumers track offsets to know where they are in the stream
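The partition and consumer-group rows can be sketched in a few lines. This is a simplified illustration, not Kafka's implementation: `choose_partition` and `assign_partitions` are hypothetical helpers, CRC32 stands in for the murmur2 hash that Kafka's Java client applies to keys, and the assignment is a plain round-robin rather than one of Kafka's real assignors.

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    # Stand-in hash (assumption): CRC32 instead of Kafka's murmur2.
    # The point is that the same key always maps to the same partition.
    return zlib.crc32(key) % num_partitions


def assign_partitions(partitions, members):
    # Round-robin sketch: every partition goes to exactly one group member.
    ordered = sorted(members)
    assignment = {member: [] for member in ordered}
    for i, partition in enumerate(sorted(partitions)):
        assignment[ordered[i % len(ordered)]].append(partition)
    return assignment


num_partitions = 6
first = choose_partition(b"user-42", num_partitions)
second = choose_partition(b"user-42", num_partitions)

group = assign_partitions(range(num_partitions),
                          ["consumer-a", "consumer-b", "consumer-c"])

# With more members than partitions, the extra member gets nothing to read.
oversized = assign_partitions(range(2),
                              ["consumer-a", "consumer-b", "consumer-c"])
```

Keeping one key on one partition is what preserves per-key ordering, and the `oversized` case shows why adding consumers beyond the partition count does not add read parallelism.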

How Kafka uses system design patterns

Problem | Pattern | How Kafka uses it
Durable message storage | Write-ahead Log | Kafka is a distributed write-ahead log -- every message is durably appended
Managing log size | Segmented Log | Partitions are split into segments for efficient purging and lookup
Tracking replication progress | High-Water Mark | Consumers only see messages up to the high-water mark (replicated to every in-sync replica)
Partition leadership | Leader and Follower | Each partition has one leader that handles all writes and, by default, all reads; followers replicate from it
Preventing zombie controllers | Split-brain (Epoch number) | Controller epoch prevents stale controllers from issuing commands
Verifying data integrity | Checksum | A CRC checksum (CRC-32C per record batch in the current message format, per-message CRC32 in older formats), verified by brokers and consumers
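As one example from the table, the high-water mark can be sketched as the smallest log end offset among the in-sync replicas. The function names and broker IDs below are hypothetical, and the sketch ignores how replicas actually report their progress to the leader.

```python
def high_water_mark(log_end_offsets, isr):
    # Log end offset = offset of the next record a replica would write.
    # The high-water mark is the minimum of those offsets across the ISR.
    return min(log_end_offsets[replica] for replica in isr)


def visible_records(leader_log, hwm):
    # Consumers may read only offsets below the high-water mark.
    return leader_log[:hwm]


leader_log = ["m0", "m1", "m2", "m3", "m4"]
log_end_offsets = {"broker-1": 5, "broker-2": 3, "broker-3": 4}

hwm_all = high_water_mark(log_end_offsets, {"broker-1", "broker-2", "broker-3"})
readable = visible_records(leader_log, hwm_all)

# If the slowest follower drops out of the ISR, the mark can advance.
hwm_shrunk = high_water_mark(log_end_offsets, {"broker-1", "broker-3"})
```

Records above the mark exist on the leader but are hidden from consumers, so a leader failure cannot make a consumer "un-see" a message that the new leader never received.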

Kafka's delivery semantics

Guarantee | How it works | When to use
At-most-once | Consumer commits offset before processing. If it crashes, the message is skipped. | Low-value messages where losing some is acceptable (metrics, logs)
At-least-once | Consumer commits offset after processing. If it crashes, the message is reprocessed. | Most use cases -- idempotent consumers handle duplicates
Exactly-once | Idempotent producers + transactional writes. Within a consume-process-produce flow that stays inside Kafka, each message's results are committed exactly once; side effects in external systems still need idempotent handling. | Financial transactions, inventory counts -- anything where duplicates or losses are unacceptable
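The difference between the first two guarantees comes down to the order of "commit" and "process". The simulation below is a minimal sketch with hypothetical helper names; it uses a plain list instead of a Kafka client and crashes the consumer once, between the two steps.

```python
def consume(records, start, commit_first, crash_at=None):
    """Consume from `start`; return (processed records, committed offset)."""
    processed, committed = [], start
    for offset in range(start, len(records)):
        if commit_first:
            committed = offset + 1             # at-most-once: commit first
        else:
            processed.append(records[offset])  # at-least-once: process first
        if offset == crash_at:
            return processed, committed        # crash between the two steps
        if commit_first:
            processed.append(records[offset])
        else:
            committed = offset + 1
    return processed, committed


def run_with_one_crash(records, commit_first, crash_at):
    # First run crashes; the restarted consumer resumes from the committed offset.
    first, committed = consume(records, 0, commit_first, crash_at)
    second, _ = consume(records, committed, commit_first)
    return first + second


records = ["a", "b", "c"]
at_most_once = run_with_one_crash(records, commit_first=True, crash_at=1)
at_least_once = run_with_one_crash(records, commit_first=False, crash_at=1)
```

In this run, committing first loses "b" (it was marked done but never processed), while processing first handles "b" twice, which is why at-least-once consumers are usually written to be idempotent.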
Interview angle

When asked "How does Kafka guarantee exactly-once delivery?", the answer involves three mechanisms: idempotent producers (each message carries a producer ID and sequence number, so the broker can discard retried duplicates), transactions (atomic writes across multiple partitions), and consumer offset management (offsets committed in the same transaction as the processing results).
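The first of those mechanisms, broker-side deduplication, can be sketched as follows. This is a simplified model under stated assumptions: `PartitionLog` is a hypothetical class, it tracks one sequence number per record, and it keeps only the last sequence per producer, whereas a real broker works on batches and keeps more state.

```python
class PartitionLog:
    """Toy partition that deduplicates retried writes by sequence number."""

    def __init__(self):
        self.records = []
        self.last_seq = {}  # producer_id -> last sequence number appended

    def append(self, producer_id, seq, value):
        last = self.last_seq.get(producer_id, -1)
        if seq <= last:
            return "duplicate"     # a retry of something already written
        if seq != last + 1:
            return "out_of_order"  # a gap means an earlier write is missing
        self.records.append(value)
        self.last_seq[producer_id] = seq
        return "appended"


partition = PartitionLog()
results = [
    partition.append(producer_id=1, seq=0, value="debit $10"),
    partition.append(producer_id=1, seq=0, value="debit $10"),  # network retry
    partition.append(producer_id=1, seq=2, value="credit $5"),  # skipped seq 1
    partition.append(producer_id=1, seq=1, value="debit $3"),
]
```

The retry is acknowledged without being written again, so a producer that resends after a timeout does not create a second "debit $10" in the log.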

Quick reference card

Property | Value
Type | Distributed streaming platform / commit log
CAP classification | Usually described as CP within each partition (the actual trade-off depends on settings such as acks, min.insync.replicas, and unclean leader election)
Consistency | Strong consistency per-partition via ISR
Data model | Topics → Partitions → ordered log of messages
Partitioning | Topic-level, producer-configured (key-based, or spread across partitions when there is no key)
Replication | Leader + in-sync replicas (ISR)
Ordering guarantee | Per-partition only (not across partitions)
Storage | Segmented log files on disk, retention-based cleanup
Coordination | ZooKeeper (or KRaft in newer versions)
Open source | Yes (Apache)
Design Challenge

Design a real-time fraud detection pipeline

You need to design a pipeline that processes payment events from an e-commerce platform, detects anomalies within 100ms of each transaction, supports replaying historical events to retrain fraud models, and handles 500,000 events per second at peak.
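A useful first step is a back-of-envelope sizing pass. The sketch below is only arithmetic under loudly assumed inputs: the per-partition throughput (10,000 events/s), the headroom factor (1.5), and the event size (1,000 bytes) are hypothetical placeholders that you would replace with measured values. Only the 500,000 events/s peak comes from the problem statement.

```python
import math


def partitions_needed(peak_events_per_sec, per_partition_events_per_sec,
                      headroom=1.5):
    # Enough partitions to absorb the peak plus a safety margin.
    return math.ceil(peak_events_per_sec * headroom / per_partition_events_per_sec)


def daily_bytes_upper_bound(peak_events_per_sec, event_size_bytes):
    # Upper bound: assumes the peak rate is sustained for a full day,
    # before replication and compression.
    return peak_events_per_sec * event_size_bytes * 86_400


PEAK = 500_000            # from the problem statement
PER_PARTITION = 10_000    # assumption: measure this on your own hardware
EVENT_SIZE = 1_000        # assumption: bytes per payment event

partition_count = partitions_needed(PEAK, PER_PARTITION)
replay_storage = daily_bytes_upper_bound(PEAK, EVENT_SIZE)
```

The partition count bounds how many consumers can score transactions in parallel, and the storage figure gives a ceiling on how much retention the replay requirement costs per day.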

References and further reading