What is a Distributed System?
A distributed system is a collection of autonomous computers that appears to its users as a single coherent system. In simpler terms, it is many computers working together on a problem that is too big for one machine to handle alone.
1. Real-World Analogy: The Restaurant
- Single Server (Monolith): One person is the cook, the waiter, and the cashier. If they get sick, the restaurant closes. If 100 people arrive at once, that one person is overwhelmed and service grinds to a halt.
- Distributed System: You have a team of chefs, a team of waiters, and multiple cashiers.
  - If one waiter is busy, another takes the order (Scalability).
  - If one chef is sick, the kitchen still runs (Availability).
  - The team needs a way to communicate effectively (Networking/Messaging).
2. Core Architectural Goals
In most system design interviews, you are balancing three goals:
A. Scalability
The ability to handle an increasing amount of work by adding resources.
- Vertical: Getting a bigger server (more CPU/RAM).
- Horizontal: Adding more servers.
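To make horizontal scaling concrete, here is a minimal sketch of a round-robin load balancer that spreads requests across a pool of servers. The class name `RoundRobinBalancer` and the server names are hypothetical, and a real load balancer would also track server health and load; this only illustrates that capacity grows by adding machines to the pool.

```python
from collections import Counter


class RoundRobinBalancer:
    """Hand each incoming request to the next server in the pool."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("need at least one server")
        self._servers = list(servers)
        self._next = 0

    def add_server(self, name):
        # Horizontal scaling: add another machine instead of a bigger one.
        self._servers.append(name)

    def route(self):
        # Pick servers in turn so the work is spread evenly.
        server = self._servers[self._next % len(self._servers)]
        self._next += 1
        return server


balancer = RoundRobinBalancer(["server-1", "server-2"])
balancer.add_server("server-3")  # scale out before the traffic arrives

# Route nine requests and count how many each server received.
load = Counter(balancer.route() for _ in range(9))
print(load)
```

With three servers in the pool, the nine requests are split evenly, three per server.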
B. Availability
The percentage of time the system is operational (e.g., 99.99%, or "four nines").
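An availability percentage is easier to reason about once it is converted into a downtime budget. The sketch below does that arithmetic, assuming a 365-day year; the function name `downtime_minutes_per_year` is illustrative only. At four nines, the budget works out to roughly 52.6 minutes of downtime per year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability_percent):
    """Convert an availability target into allowed downtime per year."""
    unavailable_fraction = 1 - availability_percent / 100
    return MINUTES_PER_YEAR * unavailable_fraction


for label, percent in [("two nines", 99.0),
                       ("three nines", 99.9),
                       ("four nines", 99.99)]:
    minutes = downtime_minutes_per_year(percent)
    print(f"{label} ({percent}%): {minutes:.1f} minutes/year")
```

Each extra nine cuts the allowed downtime by a factor of ten, which is why every additional nine tends to cost much more to achieve than the previous one.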
C. Reliability
The ability of a system to keep functioning correctly even when individual components fail.
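One common way to survive a failed component is to keep redundant replicas and fail over between them. The following is a minimal sketch under that assumption; `call_with_failover` and the two replica functions are hypothetical stand-ins for real network calls, not part of any library.

```python
def call_with_failover(replicas, request):
    """Try each replica in order and return the first successful response."""
    failures = []
    for name, handler in replicas:
        try:
            return name, handler(request)
        except ConnectionError as exc:
            # This replica is down; remember why and try the next one.
            failures.append(f"{name}: {exc}")
    raise RuntimeError("all replicas failed: " + "; ".join(failures))


def crashed_replica(request):
    # Hypothetical replica that is unreachable.
    raise ConnectionError("connection refused")


def healthy_replica(request):
    # Hypothetical replica that is up and serving traffic.
    return f"handled {request}"


replicas = [("replica-a", crashed_replica), ("replica-b", healthy_replica)]
served_by, response = call_with_failover(replicas, "GET /menu")
print(served_by, response)
```

The caller still gets an answer even though the first replica is down, much like the kitchen that keeps running when one chef is sick. The request only fails when every replica is unavailable.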
3. When to use / When NOT to use
- Use Distributed Systems when: You have millions of users, massive data volume, or require global availability.
- Do NOT use when: You are building an MVP with 100 users. A monolith is faster to develop, easier to deploy, and cheaper to run at small scale.
4. Common Interview Mistakes
- Over-engineering: Suggesting a global Kafka cluster for a simple internal tool.
- Ignoring Failures: Assuming the network is reliable or that servers never crash.
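To show what planning for failure can look like in code, here is a small sketch of retrying a network call with exponential backoff and jitter. `retry_with_backoff`, its default parameters, and `FlakyService` are assumptions made for illustration; production code would usually also apply a per-call timeout and an overall deadline.

```python
import random
import time


def retry_with_backoff(operation, attempts=4, base_delay=0.01, sleep=time.sleep):
    """Call operation(), retrying transient network errors with growing waits."""
    for attempt in range(attempts):
        try:
            return operation()
        except (ConnectionError, TimeoutError):
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            delay = base_delay * (2 ** attempt)  # 1x, 2x, 4x, ...
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay))


class FlakyService:
    """Hypothetical dependency that drops the first two calls."""

    def __init__(self):
        self.calls = 0

    def fetch(self):
        self.calls += 1
        if self.calls <= 2:
            raise ConnectionError("connection reset")
        return "order confirmed"


service = FlakyService()
result = retry_with_backoff(service.fetch)
print(result, "after", service.calls, "calls")
```

Retries like this are only safe when the operation is idempotent, meaning that repeating it has the same effect as doing it once. Otherwise a retried request could, for example, place the same order twice.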
Final Takeaway
Distributed systems are complex. Only add complexity when the scale demands it.