13 Fencing
You've detected a split-brain situation -- a zombie leader is still running after a new leader was elected. Generation numbers ensure nodes will eventually reject the zombie's commands. But what about the window between the new leader taking over and the zombie learning it's been replaced? During that window, the zombie might still issue writes, modify shared storage, or corrupt data.
Fencing ensures the zombie can't do any damage during this dangerous transition period.
Background
In a leader-follower setup, detecting that the old leader is stale (via generation numbers) isn't enough. Consider this timeline:
1. Leader A (generation 1) is serving writes to shared storage
2. Leader A becomes unresponsive (network partition or GC pause)
3. The cluster elects Leader B (generation 2)
4. Leader B starts writing to shared storage
5. Leader A comes back, doesn't yet know it's been replaced, and writes to shared storage
From step 5 onward, until Leader A learns it has been replaced, both leaders are writing to the same shared storage. Even though generation numbers will eventually resolve this, the concurrent writes during the overlap window can cause corruption.
Fencing closes this window by proactively blocking the old leader's access to shared resources.
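To make the overlap concrete, here is a minimal sketch that replays the timeline against a toy storage object with no fencing. The class name UnfencedStorage and the balance values are made up for illustration; real shared storage is obviously more involved.

```python
class UnfencedStorage:
    """Toy shared storage that accepts every write, with no fencing at all."""

    def __init__(self):
        self.log = []  # ordered list of (writer, generation, value)

    def write(self, writer, generation, value):
        self.log.append((writer, generation, value))


storage = UnfencedStorage()
storage.write("A", 1, "balance=100")  # step 1: Leader A is serving writes
# steps 2-3: A stalls and the cluster elects Leader B (generation 2)
storage.write("B", 2, "balance=80")   # step 4: Leader B starts writing
storage.write("A", 1, "balance=120")  # step 5: zombie A wakes up and writes

# The stale generation-1 write landed after B's write and is now the latest entry.
latest_writer, latest_generation, latest_value = storage.log[-1]
```

Nothing in this storage stops the generation-1 write from overwriting what the generation-2 leader just wrote, which is exactly the gap fencing is meant to close.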
Definition
Fencing puts a "fence" around the previously active leader, preventing it from accessing cluster resources and serving any read/write requests.
Fencing techniques
Resource fencing
Block the old leader's access to the specific resources it needs:
- Revoke shared storage access -- Change NFS permissions or storage ACLs so the old leader's credentials no longer work
- Disable network ports -- Use remote management commands to block the old leader's network access to critical services
- Invalidate tokens -- If the storage system uses tokens/leases, revoke the old leader's token
The advantage: surgical and targeted. The old leader node stays running (useful for debugging) but can't affect shared state.
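As a rough illustration of the token-invalidation variant, the sketch below models a storage service that checks a token on every write. TokenFencedStorage and its method names are hypothetical, not the API of any real storage system.

```python
class TokenFencedStorage:
    """Toy storage that only accepts writes carrying a currently valid token."""

    def __init__(self):
        self._valid_tokens = set()
        self.data = {}

    def issue_token(self, token):
        self._valid_tokens.add(token)

    def revoke_token(self, token):
        # discard() is a no-op if the token was already revoked
        self._valid_tokens.discard(token)

    def write(self, token, key, value):
        if token not in self._valid_tokens:
            raise PermissionError(f"token {token!r} has been revoked")
        self.data[key] = value


storage = TokenFencedStorage()
storage.issue_token("token-A")
storage.write("token-A", "owner", "A")

# Failover: fence A first, then let B in.
storage.revoke_token("token-A")
storage.issue_token("token-B")
storage.write("token-B", "owner", "B")

try:
    storage.write("token-A", "owner", "A-again")
    zombie_blocked = False
except PermissionError:
    zombie_blocked = True
```

The old leader's process is still alive here and can keep retrying, but every write it attempts is refused at the storage layer.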
Node fencing (STONITH)
Stop the old leader entirely:
- Power off the node -- Via IPMI, iLO, or cloud provider API
- Force restart -- Hard reset the machine
- Kill the process or instance -- Terminate the leader process, or the whole VM or container it runs in
This is the nuclear option, known as STONITH -- "Shoot The Other Node In The Head." It's aggressive but unambiguous: a powered-off node definitely can't issue conflicting commands.
| Technique | Precision | Certainty | Recovery |
|---|---|---|---|
| Resource fencing | High (blocks specific access) | Medium (process still runs, might find another path) | Node stays up, can be re-added |
| Node fencing (STONITH) | Low (kills everything) | High (node is definitely stopped) | Requires full restart |
Resource fencing is preferred when you can reliably enumerate all the resources the old leader needs. Node fencing is the fallback when you can't be sure you've blocked every path -- when in doubt, kill the node.
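One way to express the "when in doubt, kill the node" rule is an escalation function. This is only a sketch: the fencer callables and power_off are placeholders for whatever mechanism an environment actually provides (ACL changes, IPMI, a cloud API), and each is assumed to return True only when the fence is confirmed.

```python
from typing import Callable, Iterable


def fence_old_leader(
    node: str,
    resource_fencers: Iterable[Callable[[str], bool]],
    power_off: Callable[[str], bool],
) -> str:
    """Try resource fencing first; escalate to node fencing if any step is unconfirmed."""
    fencers = list(resource_fencers)
    try:
        # all() stops at the first fencer that reports failure
        if fencers and all(fencer(node) for fencer in fencers):
            return "resource"
    except Exception:
        pass  # an error means we cannot confirm the fence, so escalate

    if power_off(node):
        return "node"
    raise RuntimeError(f"could not fence {node}; refusing to proceed")


# Hypothetical fencers that always succeed or always fail.
ok = lambda node: True
failed = lambda node: False

used_when_all_confirmed = fence_old_leader("leader-a", [ok, ok], power_off=ok)
used_when_one_unconfirmed = fence_old_leader("leader-a", [ok, failed], power_off=ok)
```

If even node fencing cannot be confirmed, the function raises instead of returning, so the caller never proceeds on the assumption that the old leader is gone.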
How fencing works with generation numbers
Fencing and split-brain detection via generation numbers are complementary:
- Generation numbers ensure that followers reject stale commands -- they solve the problem from the recipient side
- Fencing ensures the zombie leader can't issue commands -- it solves the problem from the sender side
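- Together, they close both sides of the window: if fencing is slightly delayed, recipients that check generation numbers still reject the zombie's commands; if some resource doesn't check generation numbers, fencing keeps the zombie from reaching it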
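The recipient-side half can be sketched as a storage object that remembers the highest generation it has seen and refuses anything older. The names GenerationCheckedStorage and StaleGenerationError are invented for this example.

```python
class StaleGenerationError(Exception):
    """Raised when a write carries a generation older than one already seen."""


class GenerationCheckedStorage:
    def __init__(self):
        self.highest_generation = 0
        self.data = {}

    def write(self, generation, key, value):
        if generation < self.highest_generation:
            raise StaleGenerationError(
                f"generation {generation} < {self.highest_generation}"
            )
        self.highest_generation = generation
        self.data[key] = value


storage = GenerationCheckedStorage()
storage.write(1, "x", "from-A")  # Leader A, generation 1
storage.write(2, "x", "from-B")  # Leader B, generation 2

try:
    storage.write(1, "x", "from-zombie-A")
    zombie_rejected = False
except StaleGenerationError:
    zombie_rejected = True
```

Note the limit of this check: the storage only learns about generation 2 once Leader B has written to it. A zombie write that arrives before that point would still be accepted, which is why sender-side fencing is needed as well.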
Examples
HDFS
HDFS is the textbook example of fencing in production. When a standby NameNode takes over as the active NameNode:
- It fences the previously active NameNode, STONITH-style -- typically via the sshfence fencing method, which SSHes to the machine and kills the NameNode process
- It revokes the old NameNode's access to the shared edit log storage (resource fencing)
- Only then does it begin serving as the new active NameNode
Without fencing, the old NameNode could write to the edit log concurrently with the new one, corrupting the file system's metadata.
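The important property in the steps above is ordering: the new active node does not start serving until fencing is confirmed. The sketch below captures only that ordering; it is not HDFS code, and take_over and its three callables are hypothetical stand-ins.

```python
from typing import Callable


def take_over(
    fence_previous_active: Callable[[], bool],
    revoke_edit_log_access: Callable[[], bool],
    start_serving: Callable[[], None],
) -> bool:
    """Become active only after both fencing steps are confirmed."""
    if not fence_previous_active():
        return False  # old active may still be alive: stay standby
    if not revoke_edit_log_access():
        return False  # old active may still reach the edit log: stay standby
    start_serving()
    return True


events = []


def record(name, result=True):
    """Build a step that logs its name and reports the given result."""
    def step():
        events.append(name)
        return result
    return step


promoted = take_over(record("fence"), record("revoke"), record("serve"))
```

If the first step reports failure, take_over returns False without ever calling start_serving, so two active nodes are never serving at once in this model.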
Chubby / ZooKeeper
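Chubby's lease mechanism is a form of time-based fencing. When a session lease expires, all locks held by that session are automatically released. The old leader loses its locks and therefore can't access the resources those locks protected. ZooKeeper behaves similarly: when a session expires, the ephemeral znodes it created (commonly used as locks or leader markers) are deleted. This is "soft fencing" -- it relies on all participants respecting the lease protocol, so a process that was paused and doesn't re-check its lease can still act on a lock it no longer holds.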
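A minimal sketch of lease-based locking follows, assuming a simplified lock service with an explicit clock value passed in as now. This is not Chubby's or ZooKeeper's API; LeaseLockService and its methods are hypothetical.

```python
class LeaseLockService:
    """Toy lock service: a lock is held only while its lease is unexpired."""

    def __init__(self):
        self._leases = {}  # lock name -> (holder, expires_at)

    def acquire(self, lock, holder, duration, now):
        current = self._leases.get(lock)
        if current is not None:
            current_holder, expires_at = current
            if now < expires_at and current_holder != holder:
                return False  # someone else still holds a live lease
        self._leases[lock] = (holder, now + duration)
        return True

    def holds(self, lock, holder, now):
        current = self._leases.get(lock)
        return current is not None and current[0] == holder and now < current[1]


service = LeaseLockService()
service.acquire("leader", "A", duration=10, now=0)
b_early = service.acquire("leader", "B", duration=10, now=5)   # A's lease is live
b_late = service.acquire("leader", "B", duration=10, now=11)   # A's lease expired
a_still_leader = service.holds("leader", "A", now=11)
```

The "soft" part shows up in holds: the service knows A is no longer the leader, but A is only stopped if it actually calls holds (or an equivalent check) before touching the protected resource.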
Fencing usually comes up as a follow-up question to split-brain: "OK, you have generation numbers. But what about the window before the zombie knows it's been replaced?" The answer: fence the old leader -- either block its resource access or kill the node. Always mention both generation numbers (logical safety) and fencing (physical safety) as complementary defenses.