Skip to main content

7 High-Water Mark

A leader accepts a write and logs it. It sends the write to 3 followers. Two confirm, but the third is slow. Then the leader crashes. The new leader (promoted from a follower) might not have the latest entries. Clients who read those entries from the old leader now get "data not found" from the new one.

How do you prevent clients from seeing data that might disappear if the leader fails?

Think first
A leader accepts a write, logs it, and replicates it to some followers. The leader crashes before all followers confirm. A client had read that entry from the leader. The new leader doesn't have it. How do you prevent clients from seeing data that might disappear?

Background

In a leader-follower setup, the leader writes to its write-ahead log and replicates entries to followers. But replication is asynchronous -- followers may lag behind the leader. If the leader crashes, some recent entries might only exist on the leader and are lost.

This creates a dangerous situation: a client reads the latest entry from the leader, the leader crashes, a new leader takes over without that entry, and the client's next read returns different (older) data. This is called a non-repeatable read -- the same query returns different results depending on when you ask.

Non-repeatable read

A non-repeatable read occurs when a client reads the same data twice and gets different results. In a replicated system, this happens when the client reads uncommitted data from a leader that subsequently fails.

Definition

The high-water mark is the index of the most recent log entry that has been successfully replicated to a quorum of followers. Data is only exposed to clients up to this point. Anything beyond the high-water mark is "uncommitted" and invisible to clients.

How it works

  1. Leader appends an entry to its WAL (index = 10)
  2. Leader sends the entry to all followers
  3. Followers append to their WALs and acknowledge
  4. When a quorum of followers confirms entry 10, the leader advances the high-water mark to 10
  5. The leader shares the high-water mark with followers via heartbeat messages
  6. Clients can only read up to the high-water mark (index 10)

If the leader crashes and a new leader is elected, the new leader is guaranteed to have at least all entries up to the high-water mark (because a quorum had confirmed them). Data beyond the high-water mark may be lost -- but clients never saw it, so there's no inconsistency.

Without high-water markWith high-water mark
Clients see the latest data the leader hasClients only see data confirmed by a quorum
Leader failure → data inconsistency (non-repeatable reads)Leader failure → no data loss from the client's perspective
Faster reads (no replication delay)Slight read lag (waiting for quorum confirmation)
The key insight

The high-water mark creates a contract: "Everything below this point is safe -- even if the leader crashes, it won't disappear." Everything above it is speculative and might be lost. By hiding speculative data from clients, the system guarantees that leader failover is invisible to the application.

Examples

Kafka

This is Kafka's signature use of the pattern. Kafka brokers track the high-water mark as the largest offset that all In-Sync Replicas (ISRs) share. Consumers can only read messages up to the high-water mark. This means:

  • Producers write to the leader, which replicates to ISRs
  • Once all ISRs confirm, the high-water mark advances
  • Consumers see messages only after they're committed across all ISRs
  • If the leader fails, the new leader (an ISR member) has all committed messages
Interview angle

High-water mark comes up in any design that involves replication. The question to answer: "How do you prevent clients from reading data that might be lost during failover?" The answer: track how far replication has progressed (the high-water mark) and only expose data up to that point. This pattern is used in Kafka, Raft, and many other replicated systems.

Quiz
In Kafka, consumers can only read up to the high-water mark. What would happen if Kafka allowed consumers to read beyond it (up to the leader's latest entry)?