
System Design: Designing a Video Conferencing System (Zoom / MS Teams)

How does Zoom handle 1,000 participants in a single call with low latency? A technical deep dive into WebRTC, SFU vs. MCU, and UDP vs. TCP.

Sachin Sarawgi | April 20, 2026 | 3 min read


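Designing a real-time video conferencing system like Zoom or Microsoft Teams is fundamentally different from designing a video streaming service like YouTube. YouTube can buffer several seconds of video to protect picture quality and resolution; a conferencing system cannot, because it prioritizes latency. Once one-way delay grows much beyond roughly 150 ms, participants start talking over each other and the conversation stops feeling natural.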

1. Core Requirements

  • Real-time Video/Audio: Bi-directional streaming with sub-200ms latency.
  • Large Meetings: Supporting hundreds or thousands of participants.
  • Screen Sharing: Sharing a high-resolution, low-framerate stream.
  • Resilience: Handling varying network conditions (packet loss, low bandwidth).

2. The Protocol: UDP vs. TCP

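  • The Choice: UDP (User Datagram Protocol) is the default transport for real-time media. TCP is typically only a fallback, for example when a firewall blocks UDP.
  • Why? TCP retransmits lost packets and delivers data strictly in order, so one lost packet holds up everything behind it (head-of-line blocking). In a call, it's better to lose a single frame of video (a minor glitch) than to pause the whole call while waiting for that frame to arrive.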
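The trade-off can be sketched with a small simulation. This is a simplified model, not a real network stack: the function name, the 20 ms packet interval, the 40 ms one-way delay, and the 200 ms retransmission penalty are all assumptions chosen for illustration.

```python
from typing import List, Optional, Set


def delivery_times(num_packets: int, lost: Set[int], reliable_in_order: bool,
                   interval_ms: float = 20.0, one_way_ms: float = 40.0,
                   retransmit_ms: float = 200.0) -> List[Optional[float]]:
    """Return when each packet reaches the decoder (ms), or None if it is dropped.

    reliable_in_order=True models TCP-like behavior: lost packets are resent
    and later packets wait behind them. False models UDP-like behavior:
    lost packets are simply skipped.
    """
    times: List[Optional[float]] = []
    previous = 0.0  # delivery time of the previous packet (in-order mode only)
    for seq in range(num_packets):
        arrival = seq * interval_ms + one_way_ms
        if seq in lost:
            if not reliable_in_order:
                times.append(None)  # UDP-like: accept the glitch and move on
                continue
            arrival += retransmit_ms  # TCP-like: wait for the retransmission
        if reliable_in_order:
            # In-order delivery: a packet cannot be released before its predecessor.
            arrival = max(arrival, previous)
            previous = arrival
        times.append(arrival)
    return times


if __name__ == "__main__":
    print("UDP-like:", delivery_times(5, {1}, reliable_in_order=False))
    print("TCP-like:", delivery_times(5, {1}, reliable_in_order=True))
```

With packet 1 lost, the UDP-like receiver drops one packet and keeps playing, while the TCP-like receiver delivers every later packet late because they all queue behind the retransmission.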

3. Communication Technology: WebRTC

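WebRTC is the standard for real-time communication in the browser. Two pieces matter most for connection setup:

  • STUN/TURN Servers: STUN lets a client discover its public address so that peers behind NATs can connect directly; TURN relays the media when a direct path is blocked. ICE, the connectivity framework WebRTC uses, tries the candidate paths and picks one that works.
  • Signaling: Exchanging metadata (like "I'm calling you", plus session descriptions and network candidates) before the media starts flowing. WebRTC does not define a signaling protocol; applications commonly use WebSockets for it.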
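A signaling server is essentially a message relay. The sketch below is a hypothetical in-memory version with no real WebSocket transport: the `SignalingRoom` class, the message fields (`type`, `to`, `from`), and the placeholder SDP string are all assumptions for illustration. In a browser, the actual offer and answer would come from `RTCPeerConnection`.

```python
import json
from typing import Callable, Dict


class SignalingRoom:
    """Hypothetical relay: forwards signaling messages between peers in one room."""

    ALLOWED_TYPES = {"offer", "answer", "candidate"}

    def __init__(self) -> None:
        # peer_id -> callback that delivers a JSON string to that peer
        self.peers: Dict[str, Callable[[str], None]] = {}

    def join(self, peer_id: str, send: Callable[[str], None]) -> None:
        self.peers[peer_id] = send

    def relay(self, sender_id: str, raw: str) -> bool:
        """Forward one message to its target; return False if the target is absent."""
        msg = json.loads(raw)
        if msg.get("type") not in self.ALLOWED_TYPES:
            raise ValueError(f"unsupported signaling message: {msg.get('type')!r}")
        target = self.peers.get(msg.get("to"))
        if target is None:
            return False
        msg["from"] = sender_id  # the server stamps the sender; clients cannot spoof it
        target(json.dumps(msg))
        return True


if __name__ == "__main__":
    room = SignalingRoom()
    bob_inbox = []
    room.join("alice", lambda m: None)
    room.join("bob", bob_inbox.append)
    room.relay("alice", json.dumps({"type": "offer", "to": "bob", "sdp": "<placeholder sdp>"}))
    print(bob_inbox)
```

The relay never inspects the media itself. Once both sides have exchanged an offer, an answer, and their candidates, media flows over the path that ICE selected.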

4. Scaling the Meeting: SFU vs. MCU

How do you deliver 100 video streams to 100 participants?

Option A: Peer-to-Peer (Mesh)

Every user sends their stream to every other user.

  • Limit: Only practical for very small calls (roughly 2-4 people). With N participants, each client uploads N-1 copies of its stream, so upload bandwidth is quickly exhausted.
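The arithmetic behind that limit is easy to check. The helper below is a rough sketch; the function name and the 1,500 kbps per-stream bitrate are assumptions, not measured values.

```python
def mesh_cost(participants: int, stream_kbps: int) -> dict:
    """Estimate connection count and per-client bandwidth for a full mesh."""
    others = max(participants - 1, 0)
    return {
        # Every pair of participants needs its own connection: N * (N - 1) / 2.
        "connections": participants * others // 2,
        # Each client sends one copy of its stream to every other client...
        "upload_kbps_per_client": others * stream_kbps,
        # ...and receives one stream from each of them.
        "download_kbps_per_client": others * stream_kbps,
    }


if __name__ == "__main__":
    for n in (3, 10):
        print(n, mesh_cost(n, stream_kbps=1500))
```

At 3 participants each client uploads 3,000 kbps, which many home connections can sustain. At 10 participants the same client would need 13,500 kbps of upload, which is beyond many residential links.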

Option B: MCU (Multipoint Control Unit)

The server receives all streams, mixes them into one single video (like a collage), and sends that one stream to everyone.

  • Pros: Low bandwidth for the client.
  • Cons: Extremely CPU-intensive for the server, which must decode, compose, and re-encode every stream. The extra processing also adds latency.
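The "collage" step can be illustrated with the layout calculation alone. This sketch only computes where each participant's tile would go on the output canvas; it does not decode or encode any video, and the function name and the 1280x720 canvas size are assumptions.

```python
import math
from typing import List, Tuple


def grid_layout(n: int, width: int = 1280, height: int = 720) -> List[Tuple[int, int, int, int]]:
    """Return (x, y, w, h) tile rectangles for n participants on one canvas."""
    if n <= 0:
        return []
    cols = math.isqrt(n - 1) + 1  # integer ceil(sqrt(n)), avoids float rounding
    rows = -(-n // cols)          # integer ceil(n / cols)
    tile_w, tile_h = width // cols, height // rows
    return [((i % cols) * tile_w, (i // cols) * tile_h, tile_w, tile_h)
            for i in range(n)]


if __name__ == "__main__":
    for tile in grid_layout(4):
        print(tile)
```

A real MCU would then decode every incoming stream, scale each frame into its tile, and encode the composed canvas, which is where the CPU cost comes from.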

Option C: SFU (Selective Forwarding Unit) - The Standard

The server receives all streams but doesn't mix them. It simply forwards the relevant streams to each participant.

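  • The Optimization: The SFU forwards only what each receiver needs. If a participant is muted and their camera is off, the SFU has nothing to forward for them, and in a large meeting it can limit video to the active speakers or the tiles a receiver actually displays. This kind of selective forwarding is what lets services like Zoom scale to 1,000 people.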
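The forwarding decision might look like the sketch below. It is a simplified model under stated assumptions: the `Publisher` fields, the `audio_level` loudness score, and the `max_videos` cap are hypothetical, and a real SFU makes this decision per packet stream rather than per participant record.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Publisher:
    peer_id: str
    audio_on: bool
    video_on: bool
    audio_level: float = 0.0  # hypothetical loudness score, 0.0 to 1.0


def streams_to_forward(receiver_id: str, publishers: List[Publisher],
                       max_videos: int = 4) -> Dict[str, List[str]]:
    """Pick whose audio and video one receiver should get."""
    others = [p for p in publishers if p.peer_id != receiver_id]
    # Muted participants produce no audio to forward.
    audio = [p.peer_id for p in others if p.audio_on]
    # Cameras that are off produce no video; among the rest, prefer the loudest speakers.
    with_video = [p for p in others if p.video_on]
    with_video.sort(key=lambda p: p.audio_level if p.audio_on else 0.0, reverse=True)
    video = [p.peer_id for p in with_video[:max_videos]]
    return {"audio": audio, "video": video}


if __name__ == "__main__":
    meeting = [
        Publisher("a", audio_on=True, video_on=True, audio_level=0.9),
        Publisher("b", audio_on=False, video_on=False),
        Publisher("c", audio_on=True, video_on=True, audio_level=0.2),
        Publisher("d", audio_on=False, video_on=True),
    ]
    print(streams_to_forward("c", meeting, max_videos=2))
```

Participant "b" (muted, camera off) costs the server nothing to forward, and the cap on video streams keeps each receiver's download roughly constant no matter how large the meeting grows.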

5. Handling Varying Network Conditions (Adaptive Bitrate)

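  • Simulcast: The client encodes its video at several quality levels at once (typically three: High, Medium, Low) and sends them all to the SFU. The SFU forwards the High-quality version to users with fast internet and the Low-quality version to users on slow mobile data, and it can switch a receiver between versions as that receiver's bandwidth changes.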
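The SFU's choice of version per receiver can be sketched as a simple threshold rule. The layer bitrates and the 80% headroom factor below are assumptions for illustration; real systems derive the receiver's bandwidth from congestion-control feedback.

```python
# Hypothetical bitrates for the three simulcast versions, in kbps.
LAYERS_KBPS = {"high": 2500, "medium": 800, "low": 200}


def pick_layer(estimated_kbps: float, headroom: float = 0.8) -> str:
    """Choose the best layer that fits within a receiver's estimated bandwidth.

    headroom leaves spare capacity for audio and for estimation error.
    """
    budget = estimated_kbps * headroom
    # Try layers from highest to lowest bitrate and take the first that fits.
    for name, kbps in sorted(LAYERS_KBPS.items(), key=lambda kv: kv[1], reverse=True):
        if kbps <= budget:
            return name
    # Nothing fits: fall back to the lowest layer rather than sending no video.
    return min(LAYERS_KBPS, key=LAYERS_KBPS.get)


if __name__ == "__main__":
    for kbps in (5000, 1200, 100):
        print(kbps, "->", pick_layer(kbps))
```

Because the sender has already encoded every layer, the SFU can switch a receiver between them without transcoding.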

6. Global Scalability

Video servers must be placed in data centers geographically close to participants to minimize propagation delay, the "Speed of Light" limit on how quickly a signal can cross the distance.

  • Geo-routing: If users in London are talking, the meeting should be hosted on a server in London, not New York.
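A minimal geo-routing sketch, assuming a hypothetical set of three regions and participants identified by latitude and longitude. It picks the region that minimizes the worst-case distance to any participant. Real systems usually route on measured round-trip time rather than geographic distance.

```python
import math
from typing import Dict, List, Tuple

Coord = Tuple[float, float]  # (latitude, longitude) in degrees

# Hypothetical data center locations.
REGIONS: Dict[str, Coord] = {
    "london": (51.5074, -0.1278),
    "new_york": (40.7128, -74.0060),
    "singapore": (1.3521, 103.8198),
}


def haversine_km(a: Coord, b: Coord) -> float:
    """Great-circle distance between two points on Earth, in kilometers."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def one_way_delay_ms(km: float) -> float:
    """Lower bound on propagation delay; light in fiber covers about 200 km per ms."""
    return km / 200.0


def pick_region(participants: List[Coord]) -> str:
    """Pick the region with the smallest worst-case distance (non-empty list assumed)."""
    return min(REGIONS, key=lambda r: max(haversine_km(REGIONS[r], p) for p in participants))


if __name__ == "__main__":
    london_and_paris = [(51.50, -0.12), (48.8566, 2.3522)]
    print(pick_region(london_and_paris))
    km = haversine_km(REGIONS["london"], REGIONS["new_york"])
    print(f"London to New York: about {km:.0f} km, at least {one_way_delay_ms(km):.0f} ms one way")
```

Hosting a London meeting in New York would add a transatlantic crossing in each direction for every packet, before any queuing or processing delay.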

Summary

The engineering of video conferencing is a masterclass in low-latency networking. By leveraging UDP, SFU architectures, and simulcast for adaptive quality, you can build a platform that makes global communication feel as natural as a face-to-face meeting.
