System Design: Designing a Collaborative Editor (Google Docs)
Designing a collaborative editor like Google Docs or Notion is one of the most technically challenging system design problems. It requires synchronizing the state of a document across multiple users in real-time, even when they are editing the same sentence simultaneously.
1. Core Requirements
- Real-time Collaboration: Multiple users can edit the same document at the same time.
- Low Latency: Changes from one user should appear on others' screens almost immediately, ideally within a few hundred milliseconds.
- Consistency: Everyone must eventually see the same final document state.
- Offline Support: Users should be able to edit while offline and sync later.
2. The Conflict Problem
Suppose User A and User B both have a document containing "Hello", and each types a character at the same time:
- User A types "!" after "o" -> "Hello!"
- User B types " " after "o" -> "Hello "
- A simple "Last Write Wins" policy keeps only one user's version of the document, so the other user's character is lost. We need a way to merge both operations instead.
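The conflict above can be reproduced in a few lines. This is a minimal sketch: the `insert` helper and the `(position, character)` tuples are illustrative assumptions, not part of any real editor's API. It shows that Last Write Wins drops an edit, and that naively replaying both edits keeps both characters but leaves the two replicas with different text.

```python
def insert(text: str, pos: int, ch: str) -> str:
    """Return text with ch inserted at index pos (hypothetical helper)."""
    return text[:pos] + ch + text[pos:]

base = "Hello"
op_a = (5, "!")   # User A: insert "!" after "o"
op_b = (5, " ")   # User B: insert " " after "o"

# Last Write Wins on the whole document: B's save overwrites A's.
doc_a = insert(base, *op_a)   # "Hello!"
doc_b = insert(base, *op_b)   # "Hello "
lww_result = doc_b            # A's "!" is lost

# Naive replay: each replica applies its own operation first, then the
# other user's operation unchanged. Both characters survive, but the
# replicas disagree on their order.
replica_a = insert(insert(base, *op_a), *op_b)   # "Hello !"
replica_b = insert(insert(base, *op_b), *op_a)   # "Hello! "

print(repr(lww_result), repr(replica_a), repr(replica_b))
```

Both failure modes matter: the first loses data, the second breaks the consistency requirement from Section 1.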
3. Solution A: Operational Transformation (OT)
OT is the classic solution used by Google Docs and Etherpad.
- The Concept: Instead of sending the full text, you send "Operations" (Insert, Delete, Retain).
- Transformation: When the server receives an operation that was created concurrently with one it has already applied, it "transforms" the new operation (for example, by shifting its position) so that it still has the intended effect on the updated document.
- Pros: Mature, works well for text.
- Cons: Extremely complex to implement correctly (especially the "Transformation" logic, which must cover every pair of operation types); most practical deployments rely on a central server to order operations.
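As a rough illustration of the transformation step, the sketch below handles only the insert-versus-insert case from Section 2. The `Insert` class, the `transform` function, and the `site` tie-breaker field are hypothetical names chosen for this example; a real OT system also needs rules for deletes and for every other pairing of operation types.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Insert:
    pos: int
    text: str
    site: str   # assumed client id, used only to break position ties

def apply(doc: str, op: Insert) -> str:
    return doc[:op.pos] + op.text + doc[op.pos:]

def transform(op: Insert, against: Insert) -> Insert:
    """Rewrite op so it can be applied after `against` has been applied."""
    # Shift right if the other insert landed before us, or at the same
    # position with a lower site id (a deterministic tie-break).
    landed_before = against.pos < op.pos or (
        against.pos == op.pos and against.site < op.site
    )
    if landed_before:
        return Insert(op.pos + len(against.text), op.text, op.site)
    return op

base = "Hello"
a = Insert(5, "!", "A")
b = Insert(5, " ", "B")

# Each replica applies its local operation first, then the remote
# operation transformed against the local one.
replica_a = apply(apply(base, a), transform(b, a))
replica_b = apply(apply(base, b), transform(a, b))

print(repr(replica_a), repr(replica_b))   # both are 'Hello! '
```

The tie-break on `site` is what makes the two replicas agree when both inserts target the same index; without it, each side would place its own character first.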
4. Solution B: CRDT (Conflict-free Replicated Data Types)
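CRDTs are the more recent approach, implemented by libraries such as Automerge and Yjs. Figma describes its multiplayer system as CRDT-inspired (it still uses a central server), and products such as Notion are often cited as adopting CRDT-style techniques.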
- The Concept: Data structures are designed so that any two nodes can merge their state without a central coordinator and always reach the same result.
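- Fractional Indexing: One common way to build a text CRDT is to give every character a unique, immutable position ID that sorts between its neighbors (e.g., a fraction between 0 and 1). To insert "C" between "A" (0.5) and "B" (0.6), you assign "C" the ID 0.55. Because two users can pick the same fraction concurrently, real implementations pair it with a replica ID as a tie-breaker, and they use arbitrary-precision identifiers rather than floats, which run out of precision after repeated insertions in the same spot.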
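- Pros: Decentralized (supports P2P/WebRTC); naturally handles offline-to-online sync; provable convergence (replicas that have received the same set of updates end up in the same state, regardless of delivery order).
- Cons: High memory overhead (every character carries metadata, and deleted characters are typically kept as tombstones).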
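The fractional-indexing idea can be sketched as a toy insert-only sequence CRDT. Everything here is an assumption made for illustration: the `FracText` class, the `(fraction, site)` key layout, and the use of Python's `Fraction` to avoid floating-point precision limits. Deletion (tombstones) is omitted, and the sketch refuses to insert between two characters that share the same fraction, a case production CRDTs handle with richer identifiers.

```python
from fractions import Fraction

class FracText:
    """Toy insert-only text CRDT keyed by (fraction, site id)."""

    def __init__(self, site: str):
        self.site = site
        self.chars = {}   # (Fraction, site) -> character

    def insert(self, index: int, ch: str):
        keys = sorted(self.chars)
        lo = keys[index - 1][0] if index > 0 else Fraction(0)
        hi = keys[index][0] if index < len(keys) else Fraction(1)
        if lo == hi:
            # Two concurrent inserts chose the same fraction; this toy
            # scheme has no room between them.
            raise ValueError("no free position between equal fractions")
        key = ((lo + hi) / 2, self.site)
        self.chars[key] = ch
        return key

    def merge(self, other: "FracText") -> None:
        # Set union: commutative, associative, and idempotent.
        self.chars.update(other.chars)

    def text(self) -> str:
        return "".join(self.chars[k] for k in sorted(self.chars))

a, b = FracText("A"), FracText("B")
for i, ch in enumerate("Hello"):
    a.insert(i, ch)
b.merge(a)                 # both replicas now hold "Hello"

a.insert(5, "!")           # concurrent edits, no coordination
b.insert(5, " ")

a.merge(b)
b.merge(a)
print(repr(a.text()), repr(b.text()))   # both are 'Hello! '
```

Both users pick the fraction 63/64 for their new character, and the site ID decides the order, so the replicas converge without a server. Note that each character stores a key larger than the character itself, which is the memory overhead listed under Cons.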
5. High-Level Architecture
- WebSockets: To push operations to the server and other clients with low latency.
- Document Service: Manages the active editing sessions and the OT/CRDT logic.
- Relational DB (Postgres): Stores document metadata and permissions.
- BLOB Store (S3): Stores periodic full-document snapshots, so a document can be loaded quickly without replaying its entire operation history.
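To show how these components might fit together, here is a minimal in-memory sketch of one editing session inside the Document Service. All names (`DocumentSession`, `join`, `submit`, `snapshot_every`) are hypothetical; plain callbacks stand in for WebSocket connections, a dict stands in for the blob store, and the OT/CRDT merge step is reduced to a direct insert to keep the flow visible.

```python
class DocumentSession:
    """In-memory stand-in for one active document in the Document Service."""

    def __init__(self, doc_id: str, snapshot: str = "", snapshot_every: int = 3):
        self.doc_id = doc_id
        self.text = snapshot
        self.revision = 0
        self.pending_ops = []     # operations applied since the last snapshot
        self.subscribers = {}     # client_id -> callback (stand-in for a WebSocket)
        self.snapshot_every = snapshot_every
        self.blob_store = {}      # stand-in for S3: key -> snapshot text

    def join(self, client_id: str, send):
        """Register a client and return the state it should start from."""
        self.subscribers[client_id] = send
        return self.revision, self.text

    def submit(self, client_id: str, pos: int, text: str) -> int:
        # A real service would transform or merge the operation here.
        self.text = self.text[:pos] + text + self.text[pos:]
        self.revision += 1
        msg = {"rev": self.revision, "author": client_id, "pos": pos, "text": text}
        self.pending_ops.append(msg)
        # Broadcast to every other connected client.
        for cid, send in self.subscribers.items():
            if cid != client_id:
                send(msg)
        # Periodically write a full snapshot and reset the pending list.
        if len(self.pending_ops) >= self.snapshot_every:
            self.blob_store[f"{self.doc_id}/rev-{self.revision}"] = self.text
            self.pending_ops.clear()
        return self.revision

session = DocumentSession("doc-1", snapshot="Hello")
inbox_b = []
session.join("A", lambda msg: None)
session.join("B", inbox_b.append)
session.submit("A", 5, "!")
print(session.text, inbox_b)
```

The server-assigned revision number gives every operation a single global order, which is the role the central server plays in an OT design.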
6. Real-time Awareness (Presence)
- Cursors: Showing where other users are typing.
- User List: Who is currently viewing the doc.
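- Implementation: Presence data is ephemeral and does not need to be persisted with the document. Use a fast in-memory store like Redis with a short TTL on cursor positions, so entries for disconnected users expire on their own.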
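The TTL behavior can be sketched without a Redis dependency. The `PresenceStore` class below is a hypothetical in-memory stand-in with an injectable clock so the expiry logic is easy to follow; with Redis, a similar effect comes from writing each cursor key with an expiry (for example, the `SET` command's `EX` option) and refreshing it on every heartbeat.

```python
import time

class PresenceStore:
    """In-memory stand-in for a TTL-based presence store."""

    def __init__(self, ttl_seconds: float = 5.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}   # (doc_id, user_id) -> (cursor_pos, expires_at)

    def heartbeat(self, doc_id: str, user_id: str, cursor_pos: int) -> None:
        """Record a cursor position and push its expiry forward."""
        self.entries[(doc_id, user_id)] = (cursor_pos, self.clock() + self.ttl)

    def active_cursors(self, doc_id: str) -> dict:
        """Return {user_id: cursor_pos} for users whose entry has not expired."""
        now = self.clock()
        self.entries = {k: v for k, v in self.entries.items() if v[1] > now}
        return {
            user: pos
            for (doc, user), (pos, _) in self.entries.items()
            if doc == doc_id
        }

# A fake clock makes the expiry deterministic for this example.
now = [0.0]
store = PresenceStore(ttl_seconds=5.0, clock=lambda: now[0])
store.heartbeat("doc-1", "alice", 12)
store.heartbeat("doc-1", "bob", 40)
now[0] = 3.0
store.heartbeat("doc-1", "alice", 13)    # alice is still typing
now[0] = 6.0
cursors = store.active_cursors("doc-1")  # bob's entry expired at t=5.0
print(cursors)
```

Because stale entries disappear on their own, a client that crashes or loses its connection needs no explicit "leave" message to vanish from the user list.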
Summary
Building a collaborative editor is fundamentally a State Synchronization problem. OT is mature and keeps per-character overhead low, but it is hard to implement correctly and usually depends on a central server. CRDTs accept extra metadata in exchange for decentralized, offline-first merging. By understanding both families of algorithms, you can build systems that make the world feel like it's working on a single shared screen.
