Distributed Transactions: 2PC and 3PC
Achieving ACID guarantees across multiple independent databases is the "Holy Grail" of distributed systems, and the hardest part is atomicity: either every database commits the transaction or none does. While Sagas are popular for microservices, the classic atomic commit protocols, Two-Phase Commit (2PC) and Three-Phase Commit (3PC), remain the foundation of atomic transaction management in databases and distributed storage.
1. Two-Phase Commit (2PC)
The 2PC protocol uses a central Coordinator to manage all participating nodes.
The Phases:
- Prepare Phase: The coordinator asks all participants: "Can you commit?" Each participant durably logs the transaction, so that its vote survives a crash, and replies "Yes" or "No." A "Yes" is a promise: the participant can no longer abort on its own.
- Commit Phase: If every participant said "Yes," the coordinator tells everyone to "Commit." If any participant says "No" or fails to reply before a timeout, the coordinator tells everyone to "Abort."
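The two phases above can be sketched as a small in-memory simulation. This is a minimal illustration, not a production implementation: the `Participant` class, its `can_commit` and `reachable` flags, and the `two_phase_commit` function are hypothetical names, and a real participant would write to a durable log and talk over the network.

```python
from enum import Enum


class Vote(Enum):
    YES = "yes"
    NO = "no"


class Participant:
    """Hypothetical in-memory participant (no durable log, no network)."""

    def __init__(self, name, can_commit=True, reachable=True):
        self.name = name
        self.can_commit = can_commit
        self.reachable = reachable
        self.state = "init"

    def prepare(self):
        if not self.reachable:
            # Stands in for a reply that never arrives.
            raise TimeoutError(f"{self.name} did not reply")
        if self.can_commit:
            self.state = "prepared"  # vote recorded, locks now held
            return Vote.YES
        self.state = "aborted"
        return Vote.NO

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Run both phases and return the decision: 'commit' or 'abort'."""
    # Phase 1 (Prepare): collect votes; a missing reply counts as "No".
    votes = []
    for p in participants:
        try:
            votes.append(p.prepare())
        except TimeoutError:
            votes.append(Vote.NO)

    decision = "commit" if all(v is Vote.YES for v in votes) else "abort"

    # Phase 2 (Commit): broadcast the single decision to everyone.
    for p in participants:
        if decision == "commit":
            p.commit()
        else:
            p.abort()
    return decision
```

Calling `two_phase_commit([Participant("a"), Participant("b", can_commit=False)])` returns `"abort"` and leaves both participants in the `"aborted"` state, which is the all-or-nothing outcome the protocol exists to provide.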
The Problem: The Blocking Nature
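- Blocking on Coordinator Failure: If the coordinator crashes after the prepare phase, every participant that voted "Yes" is left "in doubt." It cannot commit, because another participant may have voted "No," and it cannot abort, because the coordinator may already have decided to commit. It keeps its locks and waits until the coordinator recovers.
- Performance: 2PC is slow because participants hold database locks from the prepare step until the coordinator's decision arrives, and the protocol as a whole needs two network round-trips plus durable log writes.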
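The blocking case can be made concrete by writing down what a participant is allowed to do on its own when the coordinator goes silent. The function name and state strings below are hypothetical and chosen only for illustration.

```python
def on_coordinator_timeout(state):
    """What a 2PC participant may safely do alone if the coordinator is silent."""
    if state == "init":
        # No vote was sent, so the coordinator cannot have decided to commit.
        return "abort"
    if state == "prepared":
        # Voted "Yes": the decision could be commit or abort, and guessing
        # either way risks disagreeing with the other participants.
        return "block"
    # "committed" or "aborted": the decision is already known.
    return "done"
```

The only state with no safe unilateral action is `"prepared"`, which is exactly the window in which locks are held.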
2. Three-Phase Commit (3PC)
3PC was designed to fix the blocking issue by adding an intermediate phase.
The Phases:
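- CanCommit: The coordinator asks "Can we commit?" (similar to Prepare), and each participant votes "Yes" or "No."
- PreCommit: If everyone says "Yes," the coordinator tells everyone to "PreCommit." This signals that all votes were "Yes" and a commit will happen; each participant acknowledges it.
- DoCommit: Once every participant has acknowledged PreCommit, the coordinator tells everyone to commit.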
Why 3PC solves the blocking issue:
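If the coordinator crashes after the PreCommit phase, any participant that received PreCommit knows that every vote was "Yes," so the surviving participants can finish the commit without the coordinator. If no surviving participant has received PreCommit, nobody can have committed yet, and they can safely abort. This removes the uncertainty of the 2PC "prepared" state, provided that crashes can be detected reliably and the network does not partition.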
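A minimal sketch of that recovery rule follows, assuming crash-only failures and a network that does not partition. The `State` enum and `terminate_3pc` function are hypothetical names; the input is the list of states reported by the participants that are still reachable.

```python
from enum import Enum, auto


class State(Enum):
    INIT = auto()          # has not voted
    READY = auto()         # voted "Yes" on CanCommit
    PRECOMMITTED = auto()  # received PreCommit
    COMMITTED = auto()
    ABORTED = auto()


def terminate_3pc(states):
    """Decision the surviving participants reach after the coordinator fails."""
    if any(s is State.COMMITTED for s in states):
        return "commit"  # a decision already exists: follow it
    if any(s is State.ABORTED for s in states):
        return "abort"   # a decision already exists: follow it
    if any(s is State.PRECOMMITTED for s in states):
        # PreCommit is only sent after unanimous "Yes" votes.
        return "commit"
    # Nobody saw PreCommit, so DoCommit was never sent to anyone.
    return "abort"
```

For example, `terminate_3pc([State.READY, State.PRECOMMITTED])` yields `"commit"`, while `terminate_3pc([State.READY, State.READY])` yields `"abort"`. In both cases the survivors make progress instead of waiting for the coordinator.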
The Reality
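While 3PC is non-blocking when nodes simply crash, it is still vulnerable to network partitions. If a partition separates participants that received PreCommit from those that did not, one side may commit while the other aborts, leaving the system in an inconsistent state. Because of this, and because of the extra round-trip it adds, it is rarely used in high-scale cloud systems.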
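The partition problem can be reproduced with a few lines. In this hypothetical scenario the coordinator crashes after PreCommit has reached only `p1` and `p2`, and a partition then isolates `p3`. Each side applies the same recovery rule using only the states it can see; the participant names, state strings, and the `decide` function are illustrative assumptions.

```python
def decide(visible_states):
    """Recovery rule applied by one group of mutually reachable participants."""
    if "precommitted" in visible_states:
        return "commit"
    return "abort"


# States at the moment the coordinator crashes (assumed for illustration).
states = {"p1": "precommitted", "p2": "precommitted", "p3": "ready"}

# The partition splits the participants into two groups.
side_a = ["p1", "p2"]
side_b = ["p3"]

decision_a = decide([states[p] for p in side_a])
decision_b = decide([states[p] for p in side_b])
```

Here `decision_a` is `"commit"` and `decision_b` is `"abort"`: each side acted correctly on what it could see, yet the transaction is no longer atomic.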
3. Why are they rarely used in Microservices?
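- Sync/Blocking: They require active, synchronous communication, so every participating service must be up and reachable for the whole protocol; one slow or unavailable participant stalls the entire transaction.
- Latency: Every transaction pays for multiple network round-trips, so across regions the commit latency is often too high to be practical.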
- Locking: They force database locks to be held for the duration of the multi-phase handshake, destroying throughput in high-concurrency systems.
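A back-of-the-envelope estimate shows how latency and locking interact. The model below is a rough assumption, not a measurement: a participant is taken to hold its locks for one network round-trip plus two log flushes (its own prepare record and the coordinator's decision record), and the input numbers are purely illustrative.

```python
def lock_hold_ms(rtt_ms, fsync_ms):
    """Rough lower bound on how long a 2PC participant holds its locks.

    Assumed model: flush the prepare record, send the vote, wait for the
    coordinator to flush its decision, then receive that decision.
    """
    return rtt_ms + 2 * fsync_ms


def max_commits_per_second(hold_ms):
    """Upper bound on back-to-back commits that contend for the same row."""
    return 1000.0 / hold_ms


# Illustrative inputs only, not measurements.
same_dc = lock_hold_ms(rtt_ms=0.5, fsync_ms=1.0)
cross_region = lock_hold_ms(rtt_ms=80.0, fsync_ms=1.0)
```

Under these assumed numbers, a contended row could sustain a few hundred commits per second inside one datacenter (`max_commits_per_second(same_dc)`), but only about a dozen per second across regions (`max_commits_per_second(cross_region)`).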
Summary
While 2PC and 3PC provide strong theoretical guarantees, 2PC's blocking behavior, 3PC's vulnerability to network partitions, and the locks and round-trips both require make them a poor fit for modern, high-scale microservices. They remain essential knowledge for understanding how database engines and distributed stores commit transactions atomically, but for distributed application design, we almost always prefer Sagas or Eventual Consistency patterns.
