How Database Replication Works

What is Replication?

Replication is the process of copying and maintaining database data across multiple servers (nodes). Instead of storing data on a single machine, replication distributes copies to several nodes so the system can survive failures and serve more users.

Why do we need it?

Problem	How replication helps
Hardware failure	If one node crashes, another has a copy — the system stays up
Read scalability	Distribute read queries across replicas instead of overloading one server
Latency	Place replicas geographically close to users — faster response times
Maintenance	Take one node down for upgrades without downtime

Key terminology

Primary / Leader / Master — the node that accepts write operations
Replica / Follower / Slave — a node that receives copies of data from the primary
Write-ahead log (WAL) — the internal log the primary ships to replicas so they can replay changes
Replication lag — the time delay between a write on the primary and when the replica applies it
Failover — the process of promoting a replica to primary when the original primary fails
Write concern — how many nodes must confirm a write before the client gets an ACK (e.g., w: 1 = primary only, w: majority = a majority of the voting nodes, primary included)
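The write concern arithmetic can be sketched in a few lines. This is a minimal illustration, not any database's implementation: the function name `required_acks` and its parameters are hypothetical, and it assumes "majority" means more than half of the voting nodes, with the primary counted among them.

```python
def required_acks(write_concern, voting_members):
    """Return how many nodes (primary included) must confirm a write.

    write_concern is either a positive integer (w: 1, w: 2, ...) or the
    string "majority". Hypothetical helper for illustration only.
    """
    if write_concern == "majority":
        # More than half of the voting nodes.
        return voting_members // 2 + 1
    if isinstance(write_concern, int) and 1 <= write_concern <= voting_members:
        return write_concern
    raise ValueError(f"unsupported write concern: {write_concern!r}")


print(required_acks(1, 3))           # 1: the primary alone
print(required_acks("majority", 3))  # 2: the primary plus one replica
print(required_acks("majority", 5))  # 3
```

With three nodes, a majority write survives the loss of any single node, because at least one of the two remaining nodes holds the write.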

How replication works (simplified)

Client sends WRITE to Primary
Primary writes to its local storage + WAL
Primary ships the WAL entry to Replicas
Replicas replay the WAL entry → data is now copied
Replicas send ACK back (timing depends on sync/async mode)
Client receives confirmation

The critical question is: when does the client get the ACK? That's what determines whether replication is synchronous or asynchronous.
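The difference in ACK timing can be made concrete with a toy in-memory model. This is a sketch under simplifying assumptions: the `Primary` and `Replica` classes, the `mode` flag, and `ship_pending` are hypothetical names, there is no network or disk, and "shipping" is a direct method call.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    name: str
    log: list = field(default_factory=list)

    def apply(self, entry):
        # Replaying a WAL entry is modeled as appending it to the local log.
        self.log.append(entry)


@dataclass
class Primary:
    replicas: list
    mode: str = "sync"  # "sync" or "async"
    log: list = field(default_factory=list)
    pending: list = field(default_factory=list)  # entries not yet shipped

    def write(self, entry):
        self.log.append(entry)  # local storage + WAL
        if self.mode == "sync":
            # Ship and wait: every replica applies the entry before the ACK.
            for replica in self.replicas:
                replica.apply(entry)
        else:
            # Async: queue the entry and ACK right away.
            self.pending.append(entry)
        return "ACK"

    def ship_pending(self):
        # Background shipping that happens some time after the ACK.
        for entry in self.pending:
            for replica in self.replicas:
                replica.apply(entry)
        self.pending.clear()


replica = Replica("A")
primary = Primary([replica], mode="async")
primary.write("x=1")
print(replica.log)  # still empty: the client was ACKed before replication
primary.ship_pending()
print(replica.log)  # the replica has caught up
```

In async mode the window between `write` returning and `ship_pending` running is the replication lag; in sync mode that window does not exist, at the cost of a slower `write`.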

Visualization

See how a single write flows through different replication topologies. Toggle between modes to compare write latency, consistency guarantees, and failure behavior:

Single-leader (sync): the primary handles all writes and waits for all replicas to confirm before ACKing the client. Strong consistency, but higher write latency.

[Diagram: Client → Primary (writes) → Replica A (reads), Replica B (reads)]
Replication Strategies Compared

Aspect	Single-Leader (Sync)	Single-Leader (Async)	Multi-Leader
Write path	Client → Primary → wait for all replicas → ACK	Client → Primary → ACK immediately	Client → any Leader → replicate to other Leaders
Read path	Any replica (always up-to-date)	Any replica (may be stale)	Any replica in any region
Consistency	Strong — all nodes agree before ACK	Eventual — replicas catch up later	Eventual — conflict resolution needed
Write latency	High (waits for all replicas)	Low (ACK immediately)	Low (local leader ACK)
Read scalability	Good — replicas serve reads	Good — replicas serve reads	Excellent — reads spread across regions
Failure tolerance	If primary fails, a replica takes over with no acknowledged writes lost; an unreachable replica blocks writes	If primary fails, writes not yet replicated may be lost	Other leaders continue accepting writes
Use case	Financial systems, user accounts	Content delivery, analytics dashboards	Geo-distributed apps, offline-first
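The write latency row can be approximated with simple arithmetic. The sketch below is a rough model, assuming the primary contacts all replicas in parallel and that the numbers passed in are hypothetical measurements; the function name `write_latency_ms` is illustrative.

```python
def write_latency_ms(mode, local_write_ms, replica_round_trips_ms):
    """Estimate client-visible write latency for one write.

    local_write_ms: time for the leader to persist the write locally.
    replica_round_trips_ms: round-trip time to each replica, in ms.
    """
    if mode == "sync":
        # The slowest replica decides when the ACK can be sent.
        return local_write_ms + max(replica_round_trips_ms)
    if mode in ("async", "multi-leader"):
        # The (local) leader ACKs without waiting for anyone else.
        return local_write_ms
    raise ValueError(f"unknown mode: {mode!r}")


# Hypothetical numbers: 2 ms local write, one nearby and one distant replica.
print(write_latency_ms("sync", 2, [5, 40]))   # 42
print(write_latency_ms("async", 2, [5, 40]))  # 2
```

The model shows why synchronous replication across regions is expensive: a single distant replica dominates every write.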

When Replication Lag Matters

In asynchronous replication, there's a window where replicas are behind the primary. This causes:

Stale reads — a query served by a lagging replica returns older data than the primary holds
Read-after-write inconsistency — a user writes a comment, refreshes, and doesn't see it because the read hit a replica that hasn't applied the write yet
Data loss on failover — if the primary crashes before replicating, writes it already acknowledged to clients are lost when a replica is promoted

Common mitigations:

Read-your-writes consistency — route the writing user's subsequent reads to the primary
Monotonic reads — ensure a user always reads from the same replica (session stickiness), so later reads never return older data than earlier ones
Semi-synchronous replication — wait for at least one replica to confirm, then ACK
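The first two mitigations are routing decisions, which a small sketch can illustrate. Everything here is an assumption made for the example: the `ReadRouter` class, the `pin_seconds` window (which should exceed the expected replication lag), and the choice of a CRC32 hash to keep each user on one replica.

```python
import time
import zlib


class ReadRouter:
    """Routes reads to the primary shortly after a user's write,
    otherwise to a replica chosen stably per user."""

    def __init__(self, replicas, pin_seconds=5.0, clock=time.monotonic):
        self.replicas = list(replicas)
        self.pin_seconds = pin_seconds
        self.clock = clock
        self.last_write = {}  # user_id -> time of the user's latest write

    def record_write(self, user_id):
        self.last_write[user_id] = self.clock()

    def route_read(self, user_id):
        wrote_at = self.last_write.get(user_id)
        if wrote_at is not None and self.clock() - wrote_at < self.pin_seconds:
            # Read-your-writes: the user just wrote, so read from the primary.
            return "primary"
        # Monotonic reads: a stable hash keeps each user on the same replica.
        index = zlib.crc32(user_id.encode("utf-8")) % len(self.replicas)
        return self.replicas[index]


router = ReadRouter(["replica-a", "replica-b"])
router.record_write("alice")
print(router.route_read("alice"))  # primary, while inside the pin window
print(router.route_read("bob"))    # one of the replicas, always the same one
```

A fixed time window is the simplest policy; it trades a little extra load on the primary for not having to track each replica's exact position.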

Single-Leader vs Multi-Leader

Single-Leader

The simplest model. One node accepts all writes; replicas only serve reads.

Easy to reason about — no write conflicts
Primary is a bottleneck for write-heavy workloads
Failover requires promoting a replica to primary (often automated in managed database services)
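One part of failover, choosing which replica to promote, can be sketched as picking the replica that has applied the most of the primary's log. This is only an illustration of that selection step, assuming log positions are plain integers; the function names are hypothetical, and failure detection and fencing of the old primary are left out.

```python
def choose_new_primary(replica_positions):
    """Pick the replica with the highest applied log position.

    replica_positions maps replica name -> last applied log position.
    Ties are broken alphabetically so the choice is deterministic.
    """
    if not replica_positions:
        raise ValueError("no replica available to promote")
    return max(sorted(replica_positions), key=replica_positions.get)


def unreplicated_writes(primary_position, replica_positions):
    """Count log entries the failed primary had that the new primary lacks."""
    new_primary = choose_new_primary(replica_positions)
    return primary_position - replica_positions[new_primary]


positions = {"replica-a": 10, "replica-b": 12}
print(choose_new_primary(positions))       # replica-b
print(unreplicated_writes(15, positions))  # 3 entries never left the primary
```

Promoting the most up-to-date replica minimizes, but does not eliminate, the writes lost when replication is asynchronous.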

Multi-Leader

Multiple nodes accept writes independently. Changes are synchronized between leaders.

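Each region has its own leader, so writes are accepted locally and a region stays writable when another is unreachable; every write is still applied on every leader, so total write capacity does not grow with the number of leaders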
Requires conflict resolution when two leaders accept concurrent writes to the same row
Common conflict strategies: last-write-wins (LWW), application-level merge, CRDTs
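Two of these strategies are small enough to sketch. The code below is illustrative, with hypothetical function names and data shapes: a last-write-wins merge over `(timestamp, node_id, value)` tuples, and a grow-only counter (G-Counter), one of the simplest CRDTs, stored as a dict of per-node counts.

```python
def lww_merge(a, b):
    """Last-write-wins: keep the version with the larger timestamp.

    Each version is (timestamp, node_id, value); node_id breaks ties.
    The losing write is discarded.
    """
    return max(a, b, key=lambda version: (version[0], version[1]))


def gcounter_merge(a, b):
    """Merge two G-Counters by taking the per-node maximum."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


def gcounter_value(counter):
    """The counter's value is the sum of all per-node counts."""
    return sum(counter.values())


print(lww_merge((10, "eu", "draft"), (12, "us", "final")))  # (12, 'us', 'final')

eu = {"eu": 3, "us": 1}  # state seen by the EU leader
us = {"eu": 2, "us": 4}  # state seen by the US leader
merged = gcounter_merge(eu, us)
print(gcounter_value(merged))  # 7
```

LWW is simple but silently drops one of the concurrent writes, and it depends on clocks being roughly in sync. The G-Counter merge loses nothing, because merging is commutative, associative, and idempotent, but it only supports increments.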

Replication in Practice

PostgreSQL: Streaming replication (async by default, sync available). Uses WAL (Write-Ahead Log) shipping.
MySQL: Binary log replication with async, semi-sync, and group replication modes.
MongoDB: Replica sets with automatic failover. Supports adjustable write concern (w: 1, w: majority).
Redis: Asynchronous replication. Sentinel for automatic failover.
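As one practical illustration, replication lag in PostgreSQL is often tracked in bytes of WAL rather than in seconds. WAL positions (LSNs) are printed as two hexadecimal numbers separated by a slash, such as 16/B374D848, and the server-side function pg_wal_lsn_diff returns the byte difference between two of them. The sketch below reproduces that arithmetic on the client side; the helper names `parse_lsn` and `lag_bytes` are hypothetical, and the LSN values are made up.

```python
def parse_lsn(lsn):
    """Convert a PostgreSQL LSN string like '16/B374D848' to a byte offset.

    The part before the slash is the high 32 bits, the part after it
    the low 32 bits, both in hexadecimal.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lag_bytes(primary_lsn, replica_lsn):
    """Bytes of WAL the replica still has to replay."""
    return parse_lsn(primary_lsn) - parse_lsn(replica_lsn)


# Hypothetical positions: the replica is 0x100 bytes behind the primary.
print(lag_bytes("16/B374D948", "16/B374D848"))  # 256
```

Byte lag says how much work is outstanding; converting it to a time delay depends on how fast the replica replays WAL.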

Learn More

Read the full theory in Database > Replication.
