Locks, Sequencers, and Lock-delay

A distributed lock by itself is not enough. What happens when a lock holder crashes, the lock is reassigned, but the old holder's in-flight messages still arrive at downstream servers? This is the stale leader (or "zombie leader") problem, and Chubby solves it with two mechanisms: sequencers and lock-delay.

Think first

A lock holder crashes, a new client acquires the lock, but the old holder's in-flight messages still arrive at downstream servers. How would you prevent these stale messages from being executed?

Lock modes

Each Chubby node can act as a reader-writer lock in one of two modes:

Mode	Behavior
Exclusive (write)	Exactly one client holds the lock
Shared (read)	Any number of clients hold the lock simultaneously
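
The compatibility rule in the table can be sketched as a toy in-memory model. This is an illustration only, not Chubby's API: the names NodeLock, Mode, try_acquire, and release are hypothetical, and a real lock service would also handle sessions, persistence, and replication.

```python
from enum import Enum


class Mode(Enum):
    EXCLUSIVE = "exclusive"  # write mode
    SHARED = "shared"        # read mode


class NodeLock:
    """Toy model of one node's reader-writer lock (hypothetical, not Chubby's API)."""

    def __init__(self):
        self.mode = None      # None while the lock is free
        self.holders = set()  # ids of clients currently holding the lock

    def try_acquire(self, client, mode):
        # A free lock can be taken in either mode.
        if not self.holders:
            self.mode = mode
            self.holders.add(client)
            return True
        # Shared holders can coexist; any request involving exclusive mode conflicts.
        if self.mode is Mode.SHARED and mode is Mode.SHARED:
            self.holders.add(client)
            return True
        return False

    def release(self, client):
        self.holders.discard(client)
        if not self.holders:
            self.mode = None
```

Two readers can hold the lock together, but a writer has to wait until every reader has released it, and a reader has to wait while a writer holds it.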

Sequencers

After acquiring a lock, a client can request a sequencer -- an opaque byte string that captures the lock's state:

Sequencer = Lock name + Lock mode (exclusive or shared) + Lock generation number

The workflow:

1. Application master acquires a Chubby lock and obtains a sequencer.
2. Master attaches the sequencer to every command it sends to worker servers.
3. Workers validate the sequencer with Chubby before executing the command.
4. If the sequencer belongs to a stale master (lock generation is outdated), Chubby rejects it.

This is the same concept as a fencing token -- a monotonically increasing token that lets downstream services reject commands from superseded leaders.
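
The sequencer workflow can be sketched in a few lines. This is a minimal model under stated assumptions, not Chubby's real interface: LockService, Worker, acquire, is_valid, and handle are hypothetical names, the lock path is made up for illustration, and acquire here simply models the lock being handed to a new holder after the previous one was declared failed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sequencer:
    lock_name: str
    mode: str        # "exclusive" or "shared"
    generation: int  # increases each time the lock changes hands


class LockService:
    """Toy stand-in for the lock service (hypothetical names throughout)."""

    def __init__(self):
        self._generation = {}  # lock name -> current generation number

    def acquire(self, lock_name, mode="exclusive"):
        # Assumption: each handover gets a strictly higher generation number.
        gen = self._generation.get(lock_name, 0) + 1
        self._generation[lock_name] = gen
        return Sequencer(lock_name, mode, gen)

    def is_valid(self, seq):
        # A sequencer is valid only if it carries the current generation.
        return self._generation.get(seq.lock_name) == seq.generation


class Worker:
    """Downstream server that checks the sequencer before executing a command."""

    def __init__(self, locks):
        self.locks = locks
        self.applied = []

    def handle(self, command, seq):
        if not self.locks.is_valid(seq):
            return False  # stale master: reject the command
        self.applied.append(command)
        return True


locks = LockService()
worker = Worker(locks)
old = locks.acquire("/ls/cell/master")  # first master takes the lock
worker.handle("write x=1", old)         # accepted
new = locks.acquire("/ls/cell/master")  # lock reassigned to a new master
worker.handle("write x=2", old)         # rejected: generation is outdated
worker.handle("write x=3", new)         # accepted
```

In this sketch the rejection depends only on the generation number, not on clocks or timeouts, which is why a delayed message from the old master is refused no matter how late it arrives.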

Interview angle

When discussing leader election in interviews, always mention fencing. "What if the old leader doesn't know it's been replaced?" is a classic follow-up. The answer: every command carries a sequencer (fencing token). Downstream services check the token and reject stale commands. Reference Chubby's sequencer or ZooKeeper's zxid as concrete examples.

Lock-delay

Not all servers support sequencer validation. For these legacy systems, Chubby provides lock-delay: a grace period during which a freed lock cannot be re-acquired by a different client.

Scenario	Behavior
Normal release	Lock is immediately available to other clients
Holder fails or becomes unreachable	Lock server prevents others from claiming the lock for the lock-delay period

Key details:

Clients can specify any lock-delay up to an upper bound (one minute in the original Chubby paper).
The upper bound prevents a faulty client from making a resource unavailable indefinitely.
Lock-delay is imperfect -- it relies on timing assumptions rather than logical ordering -- but it protects unmodified servers from everyday problems caused by message delays and restarts.
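
The two scenarios in the table, plus the upper bound, can be modelled roughly as follows. This is a sketch under assumptions: DelayedLock and its methods are hypothetical names, time is passed in explicitly as seconds so the behaviour is easy to follow, and failure detection (which in practice comes from session timeouts) is reduced to a single holder_failed call.

```python
class DelayedLock:
    """Toy model of lock-delay (hypothetical names, not Chubby's API)."""

    MAX_LOCK_DELAY = 60.0  # seconds; assumed upper bound of one minute

    def __init__(self, lock_delay):
        # Requests above the bound are clamped so a faulty client
        # cannot keep the lock unavailable indefinitely.
        self.lock_delay = min(lock_delay, self.MAX_LOCK_DELAY)
        self.holder = None
        self.failed_holder = None
        self.blocked_until = 0.0

    def try_acquire(self, client, now):
        if self.holder is not None:
            return False
        # During the lock-delay window, only the previous holder may return.
        if now < self.blocked_until and client != self.failed_holder:
            return False
        self.holder = client
        return True

    def release(self):
        # Normal release: the lock is available immediately.
        self.holder = None

    def holder_failed(self, now):
        # Holder crashed or became unreachable: start the lock-delay window.
        self.failed_holder = self.holder
        self.holder = None
        self.blocked_until = now + self.lock_delay
```

Note what the model does not do: once the window has passed, nothing stops a very late message from the old holder being accepted downstream, which is the gap sequencers close.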

Warning

Lock-delay is a best-effort safeguard. It handles common cases (message reordering, slow restarts) but cannot prevent all split-brain scenarios. If your system can support sequencer validation, always prefer sequencers over lock-delay. In interview answers, present sequencers as the primary solution and lock-delay as the fallback for legacy integration.

Quiz

What would happen if Chubby used lock-delay (time-based protection) as its only defense against zombie leaders, without sequencers?

The system would be equally safe because lock-delay prevents lock re-acquisition during the danger window.

Messages from the old lock holder that arrive after the lock-delay period expires would be incorrectly accepted, because downstream servers would have no way to distinguish commands from the old vs. new lock holder without a logical ordering mechanism.

Lock-delay would be sufficient because one minute is long enough for all messages to drain.

There would be no practical difference because zombie leader scenarios are extremely rare.
