
System Design: Distributed Transactions (2PC and 3PC)

A deep dive into the classic protocols for distributed transactions: Two-Phase Commit (2PC) and Three-Phase Commit (3PC). It covers their blocking behavior, their failure scenarios, and why modern systems prefer Sagas.

By Sachin Sarawgi | April 20, 2026 | 2 min read

Distributed Transactions: 2PC and 3PC

Achieving ACID guarantees across multiple independent databases is the "Holy Grail" of distributed systems. While Sagas are popular for microservices, the classic protocols Two-Phase Commit (2PC) and Three-Phase Commit (3PC) remain the foundation of atomic transaction management in databases and distributed storage.

1. Two-Phase Commit (2PC)

The 2PC protocol uses a central Coordinator to manage all participating nodes.

The Phases:

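  1. Prepare Phase: The coordinator asks all participants: "Can you commit?" Each participant writes the transaction to its durable log, keeps the locks it needs, and replies "Yes" or "No." A "Yes" is a promise: the participant must be able to commit later, even after a restart.
  2. Commit Phase: If every participant said "Yes," the coordinator tells everyone to "Commit." If any participant says "No" or fails to reply before the timeout, the coordinator tells everyone to "Abort."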
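
The two phases above can be sketched as a small in-memory simulation. This is a minimal illustration, not a production coordinator: the `Participant` class, its `can_commit` and `reachable` flags, and the state names are all hypothetical, and a real implementation would persist every step to a durable log.

```python
from enum import Enum


class Vote(Enum):
    YES = "yes"
    NO = "no"


class Participant:
    """Hypothetical in-memory participant (no durable log, no real locks)."""

    def __init__(self, name, can_commit=True, reachable=True):
        self.name = name
        self.can_commit = can_commit
        self.reachable = reachable
        self.state = "init"

    def prepare(self):
        if not self.reachable:
            raise TimeoutError(f"{self.name} did not reply")
        if self.can_commit:
            self.state = "prepared"  # locks stay held from here on
            return Vote.YES
        self.state = "aborted"
        return Vote.NO

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Run both phases and return the global decision."""
    # Phase 1: collect votes; a missing reply is treated as "No".
    votes = []
    for p in participants:
        try:
            votes.append(p.prepare())
        except TimeoutError:
            votes.append(Vote.NO)

    # Phase 2: commit only on a unanimous "Yes", otherwise abort.
    if all(v is Vote.YES for v in votes):
        for p in participants:
            p.commit()
        return "commit"
    for p in participants:
        if p.reachable:  # an unreachable node learns the outcome later
            p.abort()
    return "abort"
```

Calling `two_phase_commit([Participant("orders"), Participant("payments", can_commit=False)])` returns `"abort"`, because a single "No" vote is enough to abort the whole transaction.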

The Problem: The Blocking Nature

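  • Synchronous Bottleneck: If the coordinator crashes after the prepare phase, every participant that voted "Yes" is stuck. It cannot commit or abort on its own, because the coordinator may have decided either way, so it keeps its locks until the coordinator recovers.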
  • Performance: 2PC is notoriously slow because it holds database locks for a long time across multiple network round-trips.
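
The blocking case can be made concrete by asking what a participant is allowed to do when the coordinator goes silent. The function below is a sketch under the usual textbook rules; the function name and state strings are hypothetical.

```python
def on_coordinator_timeout(state):
    """What a 2PC participant may do when the coordinator stops responding."""
    if state == "init":
        # Has not voted yet, so it can safely abort on its own.
        return "abort"
    if state == "prepared":
        # Voted "Yes": the coordinator may have decided either way,
        # so the participant must keep its locks and wait.
        return "block"
    # Already committed or aborted: nothing left to decide.
    return state
```

The only state with no safe unilateral action is `"prepared"`, which is exactly the window in which a coordinator crash leaves locks held.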

2. Three-Phase Commit (3PC)

3PC was designed to fix the blocking issue by adding an intermediate phase.

The Phases:

  1. CanCommit: The coordinator asks "Can we commit?" (similar to Prepare).
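  2. PreCommit: If everyone says "Yes," the coordinator tells everyone to "PreCommit." This signals that all votes were "Yes" and a commit will happen. Participants acknowledge the message.
  3. DoCommit: Once the acknowledgements arrive, the coordinator tells everyone to commit. This is the actual commit phase.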
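
One way to picture the three phases is as a participant state machine driven by coordinator messages. The table below is an illustrative sketch; the state and message names are hypothetical labels chosen for this example, not part of any library.

```python
# (current state, message received) -> next state
TRANSITIONS = {
    ("init", "can_commit"): "ready",          # participant voted "Yes"
    ("ready", "pre_commit"): "precommitted",  # told that a commit is coming
    ("precommitted", "do_commit"): "committed",
    ("init", "abort"): "aborted",
    ("ready", "abort"): "aborted",
    ("precommitted", "abort"): "aborted",
}


def run(messages, state="init"):
    """Apply coordinator messages in order and return the final state.

    An out-of-order message raises KeyError, which keeps the sketch honest
    about which transitions are allowed.
    """
    for msg in messages:
        state = TRANSITIONS[(state, msg)]
    return state
```

For example, `run(["can_commit", "pre_commit", "do_commit"])` ends in `"committed"`, while `run(["can_commit", "abort"])` ends in `"aborted"`.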

Why 3PC solves the blocking issue:

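If the coordinator crashes after the PreCommit phase, the participants that received PreCommit know that every node voted "Yes" and a commit was imminent, so they commit after a timeout instead of waiting. A participant that times out before receiving PreCommit aborts instead. This removes the uncertainty of the 2PC "prepared" state, provided nodes fail only by crashing and messages arrive within a known time bound.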
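
The timeout rule that makes 3PC non-blocking can be sketched in a few lines. This follows the textbook defaults; the function name and state strings are hypothetical.

```python
def three_pc_timeout_action(state):
    """Default action for a 3PC participant that stops hearing from the coordinator."""
    if state in ("init", "ready"):
        # No PreCommit seen: nobody can have committed yet, so abort.
        return "abort"
    if state == "precommitted":
        # PreCommit seen: every vote was "Yes", so go ahead and commit.
        return "commit"
    # Already committed or aborted.
    return state
```

Unlike 2PC, every state has a defined action on timeout, so no participant has to wait for the coordinator to come back.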

The Reality

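While 3PC is non-blocking when nodes simply crash, it is still vulnerable to network partitions. If a partition occurs mid-protocol, participants can hold conflicting information (some have seen PreCommit, others have not) and time out to different outcomes, leaving the system in an inconsistent state. Combined with the extra round trip, this is why 3PC is rarely used in high-scale cloud systems.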
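
The partition problem can be shown with a tiny simulation. It assumes the textbook 3PC timeout defaults (abort before PreCommit, commit after it); the participant names and the `outcomes_after_partition` helper are hypothetical.

```python
# Assumed textbook 3PC timeout defaults: state at timeout -> final outcome.
TIMEOUT_OUTCOME = {"ready": "aborted", "precommitted": "committed"}


def outcomes_after_partition(participants, received_precommit):
    """Final state of each participant if the coordinator becomes unreachable
    after PreCommit was delivered only to `received_precommit`."""
    final = {}
    for name in participants:
        state = "precommitted" if name in received_precommit else "ready"
        final[name] = TIMEOUT_OUTCOME[state]
    return final


result = outcomes_after_partition(
    ["orders", "payments"], received_precommit={"orders"}
)
print(result)  # {'orders': 'committed', 'payments': 'aborted'}
```

Both participants followed the protocol correctly, yet one committed and the other aborted, which is the inconsistency described above.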

3. Why are they rarely used in Microservices?

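  • Sync/Blocking: They require active, synchronous communication, so every participant must be up and reachable at the same time for a transaction to finish.
  • Latency: Every phase costs at least one network round trip, so cross-region 2PC adds tens to hundreds of milliseconds to each transaction while locks are held.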
  • Locking: They force database locks to be held for the duration of the multi-phase handshake, destroying throughput in high-concurrency systems.
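
A back-of-envelope calculation shows how lock hold time caps throughput on a contended row. The model is a deliberate simplification with stated assumptions: a participant holds its lock for roughly one network round trip plus two log flushes between voting and learning the decision, and transactions on the same row run strictly one after another. The function name and the default flush time are hypothetical.

```python
def hot_row_throughput(rtt_ms, fsync_ms=1.0):
    """Rough upper bound on transactions per second for one contended row.

    Assumption: the lock is held for one round trip (vote out, decision back)
    plus two log flushes (participant prepare record, coordinator decision).
    """
    lock_hold_ms = rtt_ms + 2 * fsync_ms
    return 1000.0 / lock_hold_ms


same_zone = hot_row_throughput(rtt_ms=1)        # nodes close together
cross_region = hot_row_throughput(rtt_ms=100)   # nodes far apart
print(f"same zone: ~{same_zone:.0f} tx/s, cross region: ~{cross_region:.0f} tx/s")
```

Under these assumptions the cross-region case is limited to roughly ten transactions per second on a single hot row, no matter how fast the databases themselves are.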

Summary

While 2PC and 3PC provide strong atomicity guarantees on paper, 2PC's blocking behavior, 3PC's vulnerability to network partitions, and the locking and latency cost of both make them a poor fit for modern, high-scale microservices. They remain essential knowledge for understanding how database engines handle internal transactions, but for distributed application design, we almost always prefer Sagas or Eventual Consistency patterns.
