System Design: Data Partitioning and Sharding

When your database outgrows a single machine, you have two choices: Scale Up (bigger machine) or Scale Out (multiple machines). Sharding (Horizontal Partitioning) is the art of scaling out by splitting your data across multiple servers.

1. Sharding vs. Partitioning

Vertical Partitioning (Normalization): Splitting tables by columns (e.g., storing user metadata in one table and user preferences in another).
Horizontal Partitioning (Sharding): Splitting the table by rows. Server 1 holds user IDs 1-1M, Server 2 holds 1M-2M.

2. Choosing the Right Shard Key

The shard key is the most important architectural decision. If you get it wrong, you end up with Hotspots.

Hash-based Sharding: .
- Pros: Even distribution of data.
- Cons: Hard to perform range queries (e.g., "Find all users joined between X and Y") because data is scattered everywhere.
Range-based Sharding: for IDs 0-1000, for 1001-2000.
- Pros: Efficient range queries.
- Cons: Leads to "hot shards" if one range becomes popular (e.g., all new users being added to the newest shard).

3. Directory-Based Sharding

Maintain a "Shard Map" (a lookup table) that tells the application where each key lives.

Pros: Extremely flexible; you can move users between shards without changing the key.
Cons: The lookup table itself becomes a single point of failure and a performance bottleneck.

4. The Challenge: Re-Sharding

What happens when your 10 shards are full and you need to add 2 more?

If you use , changing it to forces you to move almost 100% of your data.
The Solution: Use Consistent Hashing to ensure that only of the data needs to be moved when the cluster size changes.

5. Summary

Sharding is a "last resort" for scaling. It adds significant complexity to queries (cross-shard joins are nearly impossible). Before sharding, always exhaust other options: better indexing, caching (Redis), and read replicas. But when you finally do shard, choose your key based on your most frequent access pattern.

Sachin Sarawgi

Engineering Manager and backend engineer with 10+ years building distributed systems across fintech, enterprise SaaS, and startups. CodeSprintPro is where I write practical guides on system design, Java, Kafka, databases, AI infrastructure, and production reliability.

System Design: Data Partitioning and Sharding Strategies

System Design: Data Partitioning and Sharding

1. Sharding vs. Partitioning

2. Choosing the Right Shard Key

3. Directory-Based Sharding

4. The Challenge: Re-Sharding

5. Summary

Sachin Sarawgi

Keep Learning

System Design: Designing a Database Proxy for Sharding (Vitess Style)

System Design: Designing a Content Moderation System (Meta/TikTok Scale)

Related Articles

System Design: Designing Instagram (Photo Sharing at Scale)

System Design: Designing a Database Proxy for Sharding (Vitess Style)

System Design: Designing an Ad Click Aggregator

System Design: Designing Airbnb (Hotel/Home Booking)

More in System Design

System Design: Designing Stateless Authentication

gRPC vs REST: The Decision-Maker's Guide for Backend Architecture

gRPC vs REST: A Decision-Maker's Guide for Backend Architecture

System Design: Data Partitioning and Sharding Strategies

System Design: Data Partitioning and Sharding

1. Sharding vs. Partitioning

2. Choosing the Right Shard Key

3. Directory-Based Sharding

4. The Challenge: Re-Sharding

5. Summary

Get the next backend guide in your inbox

Sachin Sarawgi

Keep Learning

System Design: Designing a Database Proxy for Sharding (Vitess Style)

System Design: Designing a Content Moderation System (Meta/TikTok Scale)

Related Articles

System Design: Designing Instagram (Photo Sharing at Scale)

System Design: Designing a Database Proxy for Sharding (Vitess Style)

System Design: Designing an Ad Click Aggregator

System Design: Designing Airbnb (Hotel/Home Booking)

More in System Design

System Design: Designing Stateless Authentication

gRPC vs REST: The Decision-Maker's Guide for Backend Architecture

gRPC vs REST: A Decision-Maker's Guide for Backend Architecture