The #1 Ben Levi Blog On The Internet

Coordination Is Hard


“Consensus,” said Leslie Lamport when he introduced the PAXOS algorithm in 1989, “is the problem of getting a group of unreliable processors to agree on anything.”
That “anything” turns out to include every financial transaction you swipe, every multiplayer game frame you render, every node in your Kubernetes cluster and every line in your distributed ledger.
Let’s unpack PAXOS in plain language, see why it still matters 35 years later, and discover why it is (intentionally) more like a parliamentary procedure than a line of code.


Problem Statement: Keeping Replicas in Sync

Imagine 5 servers in different data centers all holding copies of the same bank account.

We need one agreed-upon sequence of operations (a “log”) that survives $ f $ faults out of $ 2f+1 $ machines — PAXOS gives us that.


Cast of Characters


The Core Procedure (Two-Phase Micro-Parliament)

PAXOS boils down to:

Phase 1. Prepare Phase – Electing a Speaker

Goal: Discover any previous unfinished business and lock the room.

  1. A proposer picks a monotonically increasing proposal number $ n $ (timestamp + id to break ties).
  2. Sends Prepare(n) to a majority quorum of acceptors.
  3. An acceptor promises two things if $ n $ is the highest number it has seen:
    – Never accept any proposal numbered $ < n $.
    – Return the latest proposal it has already accepted (with its value and number).

Phase 2. Accept Phase – Passing the Bill

Goal: Actually choose a value.

  1. If the proposer receives promises from the majority, it looks at the returned values: it must pick the value associated with the highest proposal number (if there was one). If nobody had anything, it may pick its own.
  2. Sends Accept(n,v) to the same quorum.
  3. Acceptors vote “yes” if they haven’t promised to a higher prepare. After a majority accept the same $ (n,v) ,** v $** is officially chosen.
  4. Learners are notified (explicitly or by observing the quorum).

Why It Always Works

Safety – Two quorums always intersect; once a majority has accepted a value, every later proposal must continue that value.

Liveness – As long as:

…then some value will eventually be chosen. This does not guarantee fast termination, which is why real systems add leaders and timeouts (see: Multi-PAXOS, Raft).


The Full Algorithm: from Single Decision to Replicated Log

In practice we need many PAXOS instances, one per log slot (0,1,2…).
This variant is called Multi-PAXOS and is implemented in systems like Google Chubby (locks), Spanner (global SQL), and CosmosDB.

Sequence of operation:


The Bad News: Why Nobody Types Raw PAXOS Daily

That’s why most practitioners reach for Raft (same guarantees, simpler leader dance) or production-ready libraries like etcd.


Still Useful After 35 Years? Absolutely.

  1. Foundational Proof Machinery: Any new quorum consensus algorithm gets measured against the PAXOS safety proof; you want to inherit its rigor.
  2. Edge-cases Tested in Production: Chubby has run on tens of thousands of machines for >15 years; the path is paved.
  3. Performance Extensible: Modular leader leases, fast followers (Fast-PAXOS), Byzantine PAXOS, and elastic voting rings build directly on the skeleton.

Closing Cheatsheet

Concept Plain English Translation
Quorum Majority (e.g., 3 of 5) saying “yes” makes it official.
Proposal number = ballot Bigger numbers squash older ones (speaker’s gavel).
Safety Once something passes, any future bill must repeat that wording.
Liveness Keep pounding gavels until enough people vote.
Multi-PAXOS Run the parliament for every line of a multi-page log.

TL;DR

PAXOS is the distilled essence of majority voting under asynchrony and failures.
Leave the Greek letters to the textbooks; remember that if you can convince a majority of the senate to log an operation, then—even if the roof caves in—the next senate must keep the same agenda.
That guarantee is worth carrying around, even if you prefer the slicker derivative called Raft when writing your next microservice.