Kafka Introduction
Why Kafka matters
In 2010, LinkedIn had a problem that every fast-growing tech company eventually faces: data was everywhere, and nothing could keep up.
User activity events, application metrics, server logs, database changelogs -- all of this data needed to flow between dozens of internal systems. The existing solutions (traditional message queues, direct database-to-database transfers, custom ETL pipelines) were breaking under the load. They were too slow, too fragile, or couldn't handle the volume.
LinkedIn needed something fundamentally different: a system that could ingest millions of events per second, store them durably, and let multiple independent consumers read the same data stream at their own pace. Traditional message queues (like RabbitMQ) deleted messages after delivery. Databases weren't designed for high-throughput streaming. Nothing fit.
So LinkedIn built Kafka -- and in doing so, created an entirely new category of infrastructure: the distributed commit log.
Kafka appears in system design interviews constantly -- not always by name, but by pattern. Any time an interviewer asks you to design a system that needs to "decouple producers from consumers," "handle bursty traffic," or "enable real-time analytics alongside batch processing," they're describing Kafka's use case. Understanding Kafka gives you the vocabulary for all of these.
What is Kafka?
Apache Kafka is a distributed, durable, fault-tolerant streaming platform. At its core, it's a system that:
- Takes in streams of messages from applications (producers)
- Stores them reliably on a cluster of machines (brokers)
- Delivers them to applications that process them (consumers)
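The producer → broker → consumer flow above can be sketched as a minimal in-memory model. This is a toy for intuition, not the real Kafka API -- the `Broker` class and its methods are invented here:

```python
class Broker:
    """Toy broker: each topic is an append-only list of messages."""
    def __init__(self):
        self.topics = {}

    def append(self, topic, message):
        """Producer side: append a message, return its offset."""
        log = self.topics.setdefault(topic, [])
        log.append(message)
        return len(log) - 1  # offset of the newly written record

    def read(self, topic, offset):
        """Consumer side: pull everything from a given offset onward."""
        log = self.topics.get(topic, [])
        return log[offset:]

broker = Broker()
# Producers write events to a named topic.
broker.append("page-views", {"user": "alice", "page": "/home"})
broker.append("page-views", {"user": "bob", "page": "/search"})
# A consumer pulls from offset 0, at its own pace.
events = broker.read("page-views", 0)
```

Note that consumers *pull* by offset rather than having messages pushed at them -- that one design choice is what the next table contrasts with traditional queues.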
What makes Kafka different from a traditional message queue:
| Traditional message queue | Kafka |
|---|---|
| Messages deleted after consumption | Messages persisted on disk, retained for a configurable period |
| One consumer gets each message | Multiple consumer groups can independently read the same data |
| Optimized for message delivery | Optimized for throughput -- millions of messages per second |
| Push-based delivery | Pull-based -- consumers read at their own pace |
| No replay | Consumers can replay from any point in the stream |
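The fan-out and replay rows of the table follow from one mechanism: the log is shared and immutable, and each consumer group merely tracks its own read position. A sketch (group names and the `poll` helper are illustrative, not Kafka's API):

```python
log = ["e0", "e1", "e2", "e3"]             # one shared, immutable log
offsets = {"analytics": 0, "billing": 0}   # per-group read positions

def poll(group, max_records=2):
    """Each group reads from its own offset; the log itself never changes."""
    start = offsets[group]
    batch = log[start:start + max_records]
    offsets[group] = start + len(batch)
    return batch

a1 = poll("analytics")           # analytics reads the first two events
b1 = poll("billing", 4)          # billing independently reads all four
offsets["analytics"] = 0         # replay: rewind to any earlier offset
replayed = poll("analytics", 4)  # same data, read again from the start
```

Because consumption only moves a pointer, adding a new consumer group (or replaying history) costs the broker nothing extra -- it is just another reader scanning the same log.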
The commit log: Kafka's big idea
At a high level, Kafka is a distributed commit log -- an append-only data structure, similar in spirit to the write-ahead log a database uses, that stores a sequence of records:
- Records are always appended to the end of the log
- Once written, records are immutable -- they are never modified in place, and they are removed only when the retention period expires, not when they are consumed
- Reading always happens sequentially, from old to new
This simplicity is Kafka's superpower. Because all writes are sequential appends and all reads are sequential scans, Kafka can exploit sequential disk I/O -- an access pattern that, in some published benchmarks, even outpaces random access to memory.
Kafka stores everything on disk, yet achieves throughput measured in millions of messages per second. How? Sequential disk I/O. A modern spinning disk that manages only on the order of 100KB/s of random writes can sustain roughly 600MB/s of sequential writes. By constraining its data structure to append-only sequential access, Kafka turns disk from a bottleneck into an advantage. This is why Kafka can retain messages for days or weeks with negligible performance impact.
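To make the append-only pattern concrete, here is a toy log segment: records are written sequentially to a single file, while a small in-memory index maps each offset to its byte position. The file name, length-prefix framing, and index layout are simplifications of my own, not Kafka's actual on-disk format:

```python
import os
import struct
import tempfile

# One segment file; all writes go to the end of it.
path = os.path.join(tempfile.mkdtemp(), "00000000.log")
index = {}  # offset -> byte position within the segment file

with open(path, "ab") as segment:
    for offset, payload in enumerate([b"view:/home", b"view:/search", b"click:buy"]):
        index[offset] = segment.tell()
        segment.write(struct.pack(">I", len(payload)))  # 4-byte length prefix
        segment.write(payload)                          # then the record bytes

def read_at(offset):
    """Jump straight to a record via the index, then read it sequentially."""
    with open(path, "rb") as segment:
        segment.seek(index[offset])
        (length,) = struct.unpack(">I", segment.read(4))
        return segment.read(length)
```

Every write is an append at the end of the file, so the disk head (or SSD write path) never jumps around -- the index only helps readers find where a given offset starts.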
Kafka use cases
| Use case | How Kafka fits | Example |
|---|---|---|
| Metrics & monitoring | Distributed services push operational metrics to Kafka topics; monitoring systems consume and aggregate | Datadog-style dashboards pulling from Kafka streams |
| Log aggregation | Collect logs from hundreds of services into a unified stream | Replacing scattered log files with a central, queryable log pipeline |
| Stream processing | Raw data flows through multiple transformation stages via topic-to-topic processing | Enriching clickstream data before loading into a data warehouse |
| Event sourcing | Use Kafka as the system of record -- every state change is an immutable event | Order lifecycle events: created → paid → shipped → delivered |
| Website activity tracking | Kafka's original use case at LinkedIn -- capture every page view, search, and click | Real-time analytics, A/B test measurement, recommendation engines |
| Change data capture | Database changes published to Kafka for downstream systems | Keeping a search index in sync with a primary database |
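The event-sourcing row deserves a concrete sketch: if every state change is an immutable event in the log, then the current state of any entity is just a fold over the stream. Using the order-lifecycle example from the table (event shapes are illustrative):

```python
events = [
    {"order": 42, "type": "created"},
    {"order": 42, "type": "paid"},
    {"order": 42, "type": "shipped"},
]

def current_state(event_stream):
    """Replay the log from the beginning; the latest event per order wins."""
    state = {}
    for event in event_stream:
        state[event["order"]] = event["type"]
    return state

state = current_state(events)  # order 42 is currently "shipped"
```

Because the log is the system of record, any downstream system -- billing, analytics, a search index -- can rebuild its own view by replaying the same events, which is exactly the multi-consumer property the earlier table highlighted.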
Kafka isn't just a "faster RabbitMQ." It's a fundamentally different abstraction. A message queue is about delivering messages. Kafka is about storing a log of events that multiple systems can independently consume. This makes it the backbone for event-driven architectures.
What's next
In the following chapters, we'll explore:
- Kafka's high-level architecture -- brokers, topics, partitions, and replication
- A deep dive into how messages are stored and indexed
- Consumer groups -- how multiple consumers coordinate
- The workflow of producing and consuming messages
- ZooKeeper's role in Kafka's coordination
- Delivery semantics -- at-most-once, at-least-once, and exactly-once