Coordination Is Hard
Consensus, as Leslie Lamport framed it when he introduced PAXOS in “The Part-Time Parliament” (written in 1989, though not formally published until 1998), is the problem of getting a group of unreliable processors to agree on a single value.
That “single value” turns out to include every financial transaction you swipe, every multiplayer game frame you render, every node in your Kubernetes cluster, and every line in your distributed ledger.
Let’s unpack PAXOS in plain language, see why it still matters 35 years later, and discover why it is (intentionally) more like a parliamentary procedure than a line of code.
Problem Statement: Keeping Replicas in Sync
Imagine 5 servers in different data centers all holding copies of the same bank account.
- Crash: one machine dies forever.
- Partition: network flips and some of them lose contact.
- Race: a client deposits $1; three servers now report the balance as $101 while two still say $100. Which one is the truth?
We need one agreed-upon sequence of operations (a “log”) that survives $ f $ faults out of $ 2f+1 $ machines; PAXOS gives us exactly that.
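The $ f $-out-of-$ 2f+1 $ arithmetic rests on a simple fact: any two majorities of the same group must share at least one member. A small sketch makes this concrete (the helper names here are illustrative, not from any library):

```python
# Sketch: any two majority quorums of 2f+1 nodes share at least one node.
# This overlap is what lets PAXOS tolerate f crashed machines.
from itertools import combinations

def majorities(n):
    """All subsets of {0..n-1} of majority size (floor(n/2) + 1)."""
    q = n // 2 + 1
    return list(combinations(range(n), q))

f = 2
n = 2 * f + 1  # 5 nodes tolerate 2 crash faults

min_overlap = min(
    len(set(a) & set(b))
    for a in majorities(n)
    for b in majorities(n)
)
print(min_overlap)  # prints 1: every pair of majorities shares a node
```

That guaranteed shared node is the messenger: whatever one majority decided, any later majority contains at least one witness to it.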
Cast of Characters
- Clients: want the system to pick (“commit”) a value (a deposit, a write, a command).
- Proposers: make formal proposals like “I propose the next deposit is $101.”
- Acceptors: actually vote on proposals; a majority quorum of them must agree.
- Learners: might not vote, but need to know the final decision (application threads).
The Core Procedure (Two-Phase Micro-Parliament)
PAXOS boils down to:
Phase 1. Prepare Phase – Electing a Speaker
Goal: Discover any previous unfinished business and lock the room.
- A proposer picks a monotonically increasing proposal number $ n $ (timestamp + id to break ties).
- Sends Prepare(n) to a majority quorum of acceptors.
- An acceptor promises two things if $ n $ is the highest number it has seen:
– Never accept any proposal numbered $ < n $.
– Return the latest proposal it has already accepted (with its value and number).
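The acceptor’s side of Phase 1 can be sketched in a few lines. Everything here (the `Acceptor` class, the `on_prepare` method) is an illustrative name, not a real library API:

```python
# Minimal sketch of an acceptor's Prepare handling (Phase 1b).
class Acceptor:
    def __init__(self):
        self.promised_n = -1    # highest ballot number promised so far
        self.accepted_n = -1    # ballot of the last accepted proposal
        self.accepted_v = None  # value of the last accepted proposal

    def on_prepare(self, n):
        """Promise never to accept ballots < n, and report the
        latest proposal already accepted (if any)."""
        if n > self.promised_n:
            self.promised_n = n
            return ("promise", self.accepted_n, self.accepted_v)
        return ("nack", self.promised_n, None)

a = Acceptor()
print(a.on_prepare(1))  # ('promise', -1, None): nothing accepted yet
print(a.on_prepare(0))  # ('nack', 1, None): already promised ballot 1
```

Note that the promise is purely local state; an acceptor never needs to know who else is in the quorum.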
Phase 2. Accept Phase – Passing the Bill
Goal: Actually choose a value.
- If the proposer receives promises from the majority, it looks at the returned values: it must pick the value associated with the highest proposal number (if there was one). If nobody had anything, it may pick its own.
- Sends Accept(n,v) to the same quorum.
- Acceptors vote “yes” if they haven’t already promised a higher-numbered Prepare. Once a majority accept the same $ (n, v) $, the value $ v $ is officially chosen.
- Learners are notified (explicitly or by observing the quorum).
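Both phases fit in a short end-to-end sketch: one single-decree round over five in-memory acceptors. All names are illustrative, and a real deployment would add networking, retries, and durable storage:

```python
# Toy single-decree PAXOS round: Prepare, then Accept, over 5 acceptors.
class Acceptor:
    def __init__(self):
        self.promised_n, self.accepted_n, self.accepted_v = -1, -1, None

    def prepare(self, n):
        if n > self.promised_n:
            self.promised_n = n
            return (self.accepted_n, self.accepted_v)
        return None  # nack

    def accept(self, n, v):
        if n >= self.promised_n:
            self.promised_n, self.accepted_n, self.accepted_v = n, n, v
            return True
        return False

def propose(acceptors, n, my_value):
    quorum = len(acceptors) // 2 + 1
    # Phase 1: gather promises from a majority.
    promises = [p for p in (a.prepare(n) for a in acceptors) if p is not None]
    if len(promises) < quorum:
        return None
    # Must adopt the value of the highest-numbered accepted proposal, if any.
    prev_n, prev_v = max(promises, key=lambda p: p[0])
    value = prev_v if prev_n >= 0 else my_value
    # Phase 2: ask the acceptors to accept (n, value).
    votes = sum(a.accept(n, value) for a in acceptors)
    return value if votes >= quorum else None

acceptors = [Acceptor() for _ in range(5)]
print(propose(acceptors, n=1, my_value="deposit $1"))      # prints: deposit $1
print(propose(acceptors, n=2, my_value="something else"))  # still: deposit $1
```

The second call shows the safety rule in action: the later proposer’s Prepare phase surfaces the already-accepted value, so its own value is discarded and the original decision survives.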
Why It Always Works
Safety – Any two majority quorums intersect, so once a value has been accepted by a majority, every later proposer’s Prepare phase will discover that value and must carry it forward.
Liveness – As long as:
- Eventually there is a majority available, and
- One proposer at a time gets to make progress (others back off and retry with higher numbers),
…then some value will eventually be chosen. This does not guarantee fast termination, which is why real systems add leaders and timeouts (see: Multi-PAXOS, Raft).
The Full Algorithm: from Single Decision to Replicated Log
In practice we need many PAXOS instances, one per log slot (0,1,2…).
This variant is called Multi-PAXOS and underpins systems like Google Chubby (locks), Spanner (global SQL), and Azure Cosmos DB.
Sequence of operations:
- A stable leader acts as sole proposer for log slot $ i $.
- Once a slot is committed, all replicas apply the command to their state machines.
- The leader holds a lease on the right to propose, letting it batch values and skip repeated Prepare rounds, so the cost of Phase 1 is amortized across many slots.
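The replicated-log idea can be sketched without any consensus machinery at all, just to show the slot discipline: commands land in numbered slots, and every replica applies the committed prefix in order. Names here (`Replica`, `commit`) are illustrative only:

```python
# Toy sketch of a Multi-PAXOS replicated log: each slot is decided by
# one PAXOS instance (elided here); replicas apply committed slots in order.
class Replica:
    def __init__(self):
        self.log = {}            # slot -> committed command (a balance delta)
        self.applied_up_to = -1  # highest slot applied to the state machine
        self.balance = 100       # toy state machine: a bank balance

    def commit(self, slot, delta):
        self.log[slot] = delta
        # Apply the contiguous committed prefix; gaps block later slots.
        while self.applied_up_to + 1 in self.log:
            self.applied_up_to += 1
            self.balance += self.log[self.applied_up_to]

replicas = [Replica() for _ in range(3)]
# The leader assigns each command to the next free slot.
for slot, delta in enumerate([+1, -20, +5]):
    for r in replicas:
        r.commit(slot, delta)

print([r.balance for r in replicas])  # prints [86, 86, 86]: all agree
```

Because every replica applies the same commands in the same slot order, their state machines cannot diverge, which is the whole point of agreeing on a log rather than on individual values.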
The Bad News: Why Nobody Types Raw PAXOS Daily
- Hard to reason about corner cases.
- Must handle duplicate messages, message reordering, and (with extensions like PBFT) Byzantine actors.
- Leader changes are subtle to get right.
That’s why most practitioners reach for Raft (same guarantees, simpler leader dance) or production-ready libraries like etcd.
Still Useful After 35 Years? Absolutely.
- Foundational Proof Machinery: Any new quorum consensus algorithm gets measured against the PAXOS safety proof; you want to inherit its rigor.
- Edge-cases Tested in Production: Chubby has run on tens of thousands of machines for >15 years; the path is paved.
- Performance Extensible: Modular leader leases, fast followers (Fast-PAXOS), Byzantine PAXOS, and elastic voting rings build directly on the skeleton.
Closing Cheatsheet
| Concept | Plain English Translation |
|---|---|
| Quorum | A majority (e.g., 3 of 5) saying “yes” makes it official. |
| Proposal number = ballot | Bigger numbers squash older ones (the speaker’s gavel). |
| Safety | Once something passes, any future bill must repeat that wording. |
| Liveness | Keep pounding gavels until enough people vote. |
| Multi-PAXOS | Run the parliament for every line of a multi-page log. |
TL;DR
PAXOS is the distilled essence of majority voting under asynchrony and failures.
Leave the Greek letters to the textbooks; remember that if you can convince a majority of the senate to log an operation, then—even if the roof caves in—the next senate must keep the same agenda.
That guarantee is worth carrying around, even if you prefer the slicker derivative called Raft when writing your next microservice.