System Design: Multi-Leader Replication
In a single-leader setup, all writes go to one node, which becomes a bottleneck for global applications. Multi-Leader Replication allows writes to be accepted at multiple data centers simultaneously, improving write latency for geographically distant users and keeping the system writable when a datacenter fails.
1. Why Multi-Leader?
- Geo-Latency: Users in London write to the London datacenter; users in NYC write to the NYC datacenter.
- Resilience: If one datacenter goes down, others can still accept writes.
- Scalability: Each additional leader adds write capacity (horizontal write scaling).
2. The Conflict Challenge
Since writes happen in parallel, two users can update the same row simultaneously in different datacenters.
- Conflict Resolution Strategies:
- Last Write Wins (LWW): Compare timestamps and keep the latest write. Simple, but clock skew between nodes can misidentify the "latest" write, and the losing concurrent update is silently discarded.
- Conflict-free Replicated Data Types (CRDTs): Data structures designed to be merged deterministically.
- Version Vectors: Each node maintains a vector of per-node counters so replicas can tell whether two writes are causally ordered or genuinely concurrent.
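The version-vector idea above can be shown in a few lines. This is a minimal teaching sketch, not a real replication API: each replica is identified by a string node ID, and every write carries the writer's current counter vector. Comparing two vectors tells you whether one write causally follows the other or whether they are concurrent and need resolution.

```python
# Minimal sketch of version-vector causality detection. Node IDs and the
# dict-based vector representation are illustrative assumptions.

def compare(a: dict, b: dict) -> str:
    """Classify two version vectors: 'before', 'after', 'equal', or 'concurrent'."""
    nodes = set(a) | set(b)
    a_ahead = any(a.get(n, 0) > b.get(n, 0) for n in nodes)
    b_ahead = any(b.get(n, 0) > a.get(n, 0) for n in nodes)
    if a_ahead and b_ahead:
        return "concurrent"  # neither write saw the other: a true conflict
    if a_ahead:
        return "after"       # a causally follows b
    if b_ahead:
        return "before"      # a causally precedes b
    return "equal"

# Two datacenters each bump their own counter on a local write.
london = {"london": 2, "nyc": 1}  # saw nyc's first write, then wrote again
nyc    = {"london": 1, "nyc": 2}  # saw london's first write, then wrote again
print(compare(london, nyc))       # → concurrent (needs conflict resolution)
```

When `compare` returns `"concurrent"`, the system must fall back to one of the other strategies (LWW, a CRDT merge, or surfacing the conflict to the application).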
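To make the CRDT strategy concrete, here is a G-Counter (grow-only counter), one of the simplest CRDTs: each node increments only its own slot, and merging takes the per-node maximum, so replicas converge regardless of merge order. This is a teaching sketch, not a production CRDT library, and the node names are illustrative.

```python
# G-Counter CRDT sketch: merge is commutative, associative, and idempotent,
# so any replicas that exchange state eventually agree deterministically.

class GCounter:
    def __init__(self, node_id: str):
        self.node_id = node_id
        self.counts = {}  # node_id -> increments made at that node

    def increment(self, amount: int = 1):
        # Each node only ever bumps its own slot.
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def merge(self, other: "GCounter"):
        # Element-wise max: safe to apply in any order, any number of times.
        for node, n in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), n)

    def value(self) -> int:
        return sum(self.counts.values())

london, nyc = GCounter("london"), GCounter("nyc")
london.increment(3)
nyc.increment(5)
london.merge(nyc)
nyc.merge(london)
print(london.value(), nyc.value())  # both replicas converge to 8
```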
3. Replication Topologies
- All-to-All: Every node replicates to every other node.
- Circular: Writes pass around a fixed ring of nodes; a single node failure interrupts replication until the ring is reconfigured.
- Star: Writes go through a central root node that redistributes them to the others; the root is a single point of failure, making this the least resilient option.
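Whatever the topology, each node must forward writes to its peers without creating infinite replication loops; a common technique is tagging each write with the set of nodes that have already applied it. The following is a minimal sketch over an adjacency map, where the datacenter names and the flooding logic are illustrative assumptions, not a specific system's protocol:

```python
# Flood a write through a replication topology, tracking which nodes have
# already applied it so every node applies it exactly once (loop prevention).

ALL_TO_ALL = {
    "london": ["nyc", "tokyo"],
    "nyc": ["london", "tokyo"],
    "tokyo": ["london", "nyc"],
}

def replicate(write: dict, node: str, topology: dict, seen=None) -> set:
    seen = seen if seen is not None else set()
    if node in seen:
        return seen      # this node already applied the write: stop the loop
    seen.add(node)       # "apply" the write locally
    for peer in topology[node]:
        replicate(write, peer, topology, seen)
    return seen

applied = replicate({"key": "cart:42", "op": "add"}, "london", ALL_TO_ALL)
print(sorted(applied))  # → ['london', 'nyc', 'tokyo']
```

The same function works on a circular topology (each node lists only its successor) or a star (the root lists all leaves); only the adjacency map changes.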
4. Conflict Avoidance
The simplest way to handle multi-leader conflicts? Avoid them. If data is partitioned so that a specific user always hits the same datacenter (using geohashing or user-ID-based partitioning), you eliminate write conflicts on that user's data entirely.
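The routing idea above can be sketched as a deterministic "home datacenter" lookup: hash the user ID with a stable hash and use it to pick a leader, so all of one user's writes land on the same node. The datacenter names and the hash choice are illustrative assumptions.

```python
# Conflict avoidance by routing: pin each user to one home datacenter
# derived from a stable hash of the user ID (illustrative sketch).
import hashlib

DATACENTERS = ["london", "nyc", "tokyo"]

def home_datacenter(user_id: str) -> str:
    # md5 is used only as a stable hash, not for security; Python's built-in
    # hash() is salted per process and would route users differently on restart.
    digest = hashlib.md5(user_id.encode()).hexdigest()
    return DATACENTERS[int(digest, 16) % len(DATACENTERS)]

# The same user always routes to the same leader, so their writes never
# race against each other across datacenters.
print(home_datacenter("user-1234") == home_datacenter("user-1234"))  # → True
```

Note that this trades some latency back: a London user whose home is the NYC datacenter always writes to NYC. Real systems often combine geo-based assignment with sticky routing rather than a pure hash.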
Summary
Multi-leader replication is a powerful tool for global scale, but it demands robust conflict resolution. By partitioning traffic to avoid collisions whenever possible, you keep your system simple while gaining the massive availability benefits of a multi-leader design.
