What is the CAP Theorem?
The CAP Theorem states that in the event of a Network Partition, a distributed system can only provide two out of the following three guarantees:
- Consistency: Every read receives the most recent write or an error.
- Availability: Every request receives a (non-error) response, without the guarantee that it contains the most recent write.
- Partition Tolerance: The system continues to operate despite an arbitrary number of messages being dropped or delayed by the network between nodes.
1. The Senior Perspective: CP vs AP
In the real world, you must choose Partition Tolerance. Networks fail. Thus, the choice is always between CP and AP:
- CP (Consistency + Partition Tolerance): When a network breaks, the system stops accepting writes to ensure data remains identical everywhere.
- Example: Zookeeper, HBase, MongoDB (default).
- Best For: Financial transactions, metadata management.
- AP (Availability + Partition Tolerance): When a network breaks, nodes keep accepting writes. Data might temporarily diverge, but the system stays up.
- Example: Cassandra, DynamoDB, CouchDB.
- Best For: Social media likes, shopping carts, metrics.
2. PACELC: The Modern Extension
CAP only describes what happens during a failure. PACELC adds what happens during normal operation:
If there is a Partition, choose between Availability or Consistency; Else (no partition), choose between Latency or Consistency.
3. Consistency Models
- Strong Consistency: User sees the update immediately (Slow).
- Eventual Consistency: User sees the update after some time (Fast).
- Read-after-write: User sees their own update immediately, but others might not.
Final Takeaway
There is no "perfect" system. Every design is a choice of which failure mode you prefer.