Messaging Systems Introduction

Design a distributed messaging system that can reliably transfer a high throughput of messages between different entities.

The problem

Imagine a log aggregation service receiving hundreds of log entries per second from dozens of microservices. Three challenges immediately arise:

  1. Traffic spikes: The service handles 500 messages/second normally. What happens during a deployment when it spikes to 5,000? How do you buffer without losing data?
  2. Coupling: With direct connections, every producer must know each consumer's address, protocol, and data format. Adding a new consumer means modifying every producer that feeds it.
  3. Failures: What happens to in-flight messages when the service goes down?

A messaging system solves all three by sitting between producers and consumers -- buffering traffic spikes, decoupling components, and persisting messages until they're safely consumed.
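To make the buffering claim concrete, here is a rough back-of-the-envelope simulation of the spike from the list above. The 500 and 5,000 messages/second figures come from the example; the consumer capacity of 1,000 messages/second, the 60-second spike duration, and the function name `simulate_backlog` are assumptions chosen purely for illustration.

```python
def simulate_backlog(arrivals_per_sec, drain_per_sec):
    """Return the number of buffered messages at the end of each second."""
    backlog = 0
    history = []
    for arrived in arrivals_per_sec:
        backlog += arrived                       # producers enqueue
        backlog -= min(backlog, drain_per_sec)   # consumers drain what they can
        history.append(backlog)
    return history


# 10 s of normal traffic, a 60 s spike, then normal traffic again.
normal_seconds, spike_seconds = 10, 60
arrivals = [500] * normal_seconds + [5000] * spike_seconds + [500] * 600

# Assumed consumer capacity: 1,000 messages/second.
history = simulate_backlog(arrivals, drain_per_sec=1000)

spike_end = normal_seconds + spike_seconds  # index of the first post-spike second
peak_backlog = max(history)
drain_seconds = history.index(0, spike_end) - spike_end + 1

print(f"peak backlog: {peak_backlog} messages")
print(f"time to drain after the spike: {drain_seconds} s")
```

Under these assumptions the buffer has to hold 240,000 messages at its peak and needs 480 seconds to catch up once traffic returns to normal. The same arithmetic, with your own rates, gives a first estimate of how much storage the buffer needs.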

Think first
Before reading on, consider: if you had dozens of microservices that all needed to communicate, why not just use direct HTTP calls between them? What problems would arise at scale?

What is a messaging system?

A messaging system transfers data between services asynchronously. Producers send messages without knowing (or caring) who consumes them. Consumers process messages at their own pace. This decoupling is the key architectural benefit.

Two messaging models

Queue (point-to-point)

Each message is consumed by exactly one consumer. Once consumed, it's removed from the queue. Great for distributing work across multiple workers, but multiple consumers can't read the same message.
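The queue model can be sketched in a few lines of in-memory Python. This is a toy illustration, not a real broker: the `WorkQueue` class and the worker names are hypothetical, and the message is removed as soon as it is received, whereas production brokers usually wait for an acknowledgement first.

```python
from collections import deque


class WorkQueue:
    """Toy point-to-point queue: each message is handed out exactly once."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        """Remove and return the oldest message, or None if the queue is empty."""
        return self._messages.popleft() if self._messages else None


queue = WorkQueue()
jobs = [f"job-{i}" for i in range(6)]
for job in jobs:
    queue.send(job)

# Two workers take turns pulling from the same queue.
workers = {"worker-a": [], "worker-b": []}
names = list(workers)
turn = 0
while (message := queue.receive()) is not None:
    workers[names[turn % len(names)]].append(message)
    turn += 1

print(workers)
```

Every job ends up with exactly one worker, which is what makes the model good for spreading load and unsuitable when several services each need their own copy.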

Think first
You just saw the queue model where each message goes to exactly one consumer. What if you need multiple independent services to each receive every message -- for example, an order event that needs to update inventory, send a confirmation email, and trigger analytics? How would you modify the queue model?

Publish-subscribe (pub-sub)

Messages are organized into topics. Publishers send to a topic; all subscribers to that topic receive every message. Multiple consumers can independently read the same messages.

The messaging system (the broker) stores messages, decouples publishers from subscribers, and provides fault tolerance by persisting messages until consumed.
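A minimal in-memory sketch of pub-sub, using the order-event scenario from the question above. The `Broker` class, the `"orders"` topic name, and the message fields are all made up for illustration; a real broker would also persist messages, which this sketch does not.

```python
from collections import defaultdict


class Broker:
    """Toy pub-sub broker: every subscriber to a topic gets every message."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of inboxes

    def subscribe(self, topic):
        """Register a new subscriber and return its private inbox."""
        inbox = []
        self._subscribers[topic].append(inbox)
        return inbox

    def publish(self, topic, message):
        # The publisher never sees who is subscribed; the broker fans out.
        for inbox in self._subscribers[topic]:
            inbox.append(message)


broker = Broker()
inventory = broker.subscribe("orders")
email = broker.subscribe("orders")
analytics = broker.subscribe("orders")

order = {"order_id": 1, "item": "book"}
broker.publish("orders", order)

print(inventory, email, analytics)
```

Adding a fourth subscriber is one more `subscribe` call; the publishing code does not change, which is the decoupling the model is meant to provide.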

Why Kafka is different

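Many traditional messaging systems implement either the queue model or pub-sub. Kafka supports both through consumer groups: consumers within the same group divide a topic's messages among themselves, so each message is handled by only one member of the group (queue behavior), while each separate group independently receives every message (pub-sub behavior). And unlike traditional brokers, Kafka retains messages after consumption for a configurable retention period -- consumers can rewind and replay from any offset that is still retained.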
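The consumer-group idea can be modeled with a retained log plus one read offset per group. This is a deliberately simplified sketch and not the Kafka client API: the `Topic` class and its `poll` and `seek` methods are hypothetical, there is only one partition, and group members share a single offset, whereas real Kafka divides work by assigning partitions to the members of a group.

```python
class Topic:
    """Toy append-only log with one read offset per consumer group."""

    def __init__(self):
        self.log = []       # messages are retained after being read
        self.offsets = {}   # group name -> index of the next message to read

    def publish(self, message):
        self.log.append(message)

    def poll(self, group):
        """Return the next unread message for this group, or None if caught up."""
        offset = self.offsets.get(group, 0)
        if offset >= len(self.log):
            return None
        self.offsets[group] = offset + 1
        return self.log[offset]

    def seek(self, group, offset):
        """Move a group's read position, e.g. back to 0 to replay."""
        self.offsets[group] = offset


topic = Topic()
for i in range(4):
    topic.publish(f"order-{i}")

# Two consumers in the same group split the messages (queue behavior).
shipping_1 = [topic.poll("shipping"), topic.poll("shipping")]
shipping_2 = [topic.poll("shipping"), topic.poll("shipping")]

# A different group has its own offset and sees everything (pub-sub behavior).
billing = [topic.poll("billing") for _ in range(4)]

# Nothing was deleted, so a group can rewind and replay.
topic.seek("shipping", 0)
replayed = topic.poll("shipping")

print(shipping_1, shipping_2, billing, replayed)
```

The key design point the sketch tries to show is that reading does not remove anything: consumption is just an offset moving forward, which is why several groups can read the same log and why replay is possible.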

Why use a messaging system?

| Benefit | How |
| --- | --- |
| Traffic buffering | Absorb spikes by queuing messages until consumers catch up |
| Guaranteed delivery | Persist messages so they survive producer/consumer failures |
| Architectural decoupling | Producers and consumers don't need to know about each other |
| Scalability | Add consumers to increase throughput without touching producers |

Quiz
What would happen if you used a pure queue model for an e-commerce system where order events need to be processed by both the shipping service and the billing service?