
System Design: Designing Twitter (Timeline and News Feed)

A deep dive into the architecture of Twitter. Learn how tweets are delivered to millions of followers' timelines using Fan-out on Write (Push) vs. Fan-out on Read (Pull) models.

Sachin Sarawgi | April 20, 2026 | 3 min read


Twitter (now X) is a massive real-time messaging system. The core technical challenge is not storing the tweets, but delivering them to millions of followers' timelines with sub-second latency.

1. Core Requirements

  • Tweet Publishing: A user can post a new tweet.
  • Timeline (Feed): A user can see tweets from people they follow.
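  • High Availability: Publishing and timeline reads must stay available even when individual servers fail.
  • Scalability: The system must handle millions of users, including high-profile "Celebrity" accounts with very large follower counts.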

2. The Fan-out Challenge

"Fan-out" is the process of delivering a single tweet to all the followers of the author.

Option A: Fan-out on Read (The Pull Model)

When a user opens their timeline, the system looks up everyone they follow, fetches each author's latest tweets, and merges them in reverse chronological order.

  • Pros: Fast writes. Publishing a tweet is a single write to the tweet store.
  • Cons: Slow reads. Doing a join across thousands of authors for every timeline refresh is extremely expensive for the database.
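
The pull model can be sketched in a few lines. This is a minimal illustration, not Twitter's implementation: the `following` and `tweets_by_author` dictionaries are hypothetical in-memory stand-ins for the follow graph and the tweet store, and the `Tweet` fields are assumptions made for the example.

```python
import heapq
from dataclasses import dataclass
from itertools import islice


@dataclass(frozen=True)
class Tweet:
    tweet_id: int
    author_id: str
    created_at: int  # Unix timestamp in seconds
    text: str


# Hypothetical in-memory stand-ins for the follow graph and the tweet store.
# Each author's tweets are kept newest-first.
following = {"alice": ["bob", "carol"]}
tweets_by_author = {
    "bob": [Tweet(3, "bob", 300, "b2"), Tweet(1, "bob", 100, "b1")],
    "carol": [Tweet(2, "carol", 200, "c1")],
}


def pull_timeline(user_id, limit=20):
    """Build the timeline at read time by merging every followee's tweets."""
    per_author = [tweets_by_author.get(a, []) for a in following.get(user_id, [])]
    # Each list is already newest-first, so a k-way merge avoids a full sort.
    merged = heapq.merge(*per_author, key=lambda t: t.created_at, reverse=True)
    return list(islice(merged, limit))


timeline = pull_timeline("alice")
```

Every call touches one list per followed author, which is why read cost grows with the number of accounts a user follows.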

Option B: Fan-out on Write (The Push Model)

When a user posts a tweet, the system immediately pushes a reference to that tweet into the "Timeline Cache" (usually in Redis) of every follower.

  • Pros: Blazing fast reads. The user's timeline is already pre-computed in Redis.
  • Cons: Slow writes. If a celebrity with 50 million followers tweets, the system must perform 50 million Redis writes immediately.
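
A minimal sketch of the push model follows. It uses a bounded in-memory `deque` per user as a stand-in for the Redis timeline cache (in Redis, a similar effect is commonly achieved with `LPUSH` followed by `LTRIM`). The `followers` map, the function names, and the `TIMELINE_MAX_LEN` cap are assumptions made for illustration.

```python
from collections import defaultdict, deque

TIMELINE_MAX_LEN = 800  # assumed cap on cached entries per user

# Hypothetical follow graph: author -> list of follower IDs.
followers = {"bob": ["alice", "dave"], "carol": ["alice"]}

# Stand-in for the timeline cache: user -> tweet IDs, newest first.
timeline_cache = defaultdict(lambda: deque(maxlen=TIMELINE_MAX_LEN))


def publish_tweet(author_id, tweet_id):
    """Push the tweet ID to every follower's timeline; return the write count."""
    writes = 0
    for follower_id in followers.get(author_id, []):
        # appendleft keeps newest first; a full deque drops its oldest entry.
        timeline_cache[follower_id].appendleft(tweet_id)
        writes += 1
    return writes


def read_timeline(user_id, limit=20):
    """Reading is a single lookup because the timeline is pre-computed."""
    return list(timeline_cache[user_id])[:limit]
```

The write count equals the author's follower count, which makes the celebrity problem concrete: at an assumed throughput of 1 million cache writes per second, a single tweet to 50 million followers would take about 50 seconds to fan out.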

3. The Hybrid Solution: Handling Celebrities

Twitter uses a hybrid approach to balance these trade-offs:

  1. Regular Users: Use Fan-out on Write (Push). Their tweets are pushed to their followers' caches immediately.
  2. Celebrities (High Follower Count): Use Fan-out on Read (Pull). Their tweets are NOT pushed to millions of caches. Instead, when a follower of a celebrity opens their timeline, the celebrity's recent tweets are fetched and merged into the pre-computed timeline on the fly.
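
The two rules above can be combined in one sketch. Everything here is an assumption made for illustration: the `CELEBRITY_THRESHOLD` value (a real cutoff would be far larger), the in-memory dictionaries standing in for the follow graph and caches, and the simplification that tweets are published in timestamp order.

```python
import heapq
from collections import defaultdict
from itertools import islice

CELEBRITY_THRESHOLD = 3  # hypothetical cutoff, tiny so the example stays small

# Hypothetical follow graph in both directions.
followers = {"bob": ["alice"], "star": ["alice", "bob", "carol"]}
following = {"alice": ["bob", "star"], "bob": ["star"], "carol": ["star"]}

# Both stores hold (created_at, tweet_id) tuples, newest first.
timeline_cache = defaultdict(list)    # user_id -> pushed entries
celebrity_tweets = defaultdict(list)  # author_id -> entries pulled at read time


def is_celebrity(author_id):
    return len(followers.get(author_id, [])) >= CELEBRITY_THRESHOLD


def publish_tweet(author_id, tweet_id, created_at):
    """Return the number of timeline writes performed for this tweet."""
    entry = (created_at, tweet_id)
    if is_celebrity(author_id):
        # Pull path: store once, no fan-out.
        celebrity_tweets[author_id].insert(0, entry)
        return 0
    # Push path: write into every follower's pre-computed timeline.
    targets = followers.get(author_id, [])
    for follower_id in targets:
        timeline_cache[follower_id].insert(0, entry)
    return len(targets)


def read_timeline(user_id, limit=20):
    """Merge the pre-computed timeline with followed celebrities' tweets."""
    sources = [timeline_cache[user_id]]
    sources += [celebrity_tweets[a] for a in following.get(user_id, []) if is_celebrity(a)]
    merged = heapq.merge(*sources, reverse=True)  # all inputs are newest-first
    return [tweet_id for _, tweet_id in islice(merged, limit)]
```

The read path now merges only as many lists as the user follows celebrities, plus one, instead of one list per followed account.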

4. Storage & Caching

  • Tweet Store: Cassandra or a similar wide-column store is ideal for storing tweets indexed by user_id and timestamp.
  • Timeline Cache: Redis stores the list of tweet IDs for each user's feed.
  • Media Store: Amazon S3 for images and videos, served via a CDN (CloudFront).
  • Search: Use Elasticsearch or a custom inverted index to handle hashtag and keyword searches.
  • Trends: Use Apache Storm or Flink for real-time stream processing to identify "Trending" topics based on tweet frequency.
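
To make the "tweet frequency" idea behind Trends concrete, here is a small sliding-window hashtag counter. It is a plain-Python stand-in for what a stream processor such as Storm or Flink would compute, not an example of either framework's API. The `TrendingWindow` class, the window length, and the hashtag regex are assumptions, and timestamps are assumed to arrive in non-decreasing order.

```python
import re
from collections import Counter, deque

HASHTAG_RE = re.compile(r"#(\w+)")


class TrendingWindow:
    """Counts hashtags seen within the last `window_seconds`."""

    def __init__(self, window_seconds=300):
        self.window_seconds = window_seconds
        self.events = deque()    # (timestamp, hashtag), oldest first
        self.counts = Counter()  # hashtag -> occurrences inside the window

    def add_tweet(self, text, timestamp):
        for tag in HASHTAG_RE.findall(text.lower()):
            self.events.append((timestamp, tag))
            self.counts[tag] += 1
        self._expire(timestamp)

    def _expire(self, now):
        # Drop events that have fallen out of the window.
        cutoff = now - self.window_seconds
        while self.events and self.events[0][0] <= cutoff:
            _, tag = self.events.popleft()
            self.counts[tag] -= 1
            if self.counts[tag] == 0:
                del self.counts[tag]

    def top(self, k=3):
        return [tag for tag, _ in self.counts.most_common(k)]


window = TrendingWindow(window_seconds=60)
window.add_tweet("#go is fun", 0)
window.add_tweet("#Python rocks", 10)
window.add_tweet("more #python", 20)
```

A production pipeline would partition this state by hashtag across workers; the single-process version only shows the windowing logic.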

Summary

The secret to Twitter's scale is pre-computation. By pushing tweets from the vast majority of accounts into pre-computed timelines and using the "Pull" model only for high-follower accounts, Twitter keeps timeline reads fast without paying the full fan-out cost for every celebrity tweet.


Written by Sachin Sarawgi

Engineering Manager and backend engineer with 10+ years building distributed systems across fintech, enterprise SaaS, and startups. CodeSprintPro is where I write practical guides on system design, Java, Kafka, databases, AI infrastructure, and production reliability.
