Database Sharding Part 2: Partitioning vs. Sharding
When engineering teams realize their database is struggling to keep up with load, the words "partitioning" and "sharding" are often thrown around interchangeably in architectural meetings.
However, they are fundamentally different concepts operating at entirely different layers of the software stack. One solves index bloat and data-maintenance pain; the other solves hard physical hardware limits.
If you jump straight to sharding when you only needed partitioning, you will introduce months of unnecessary distributed systems complexity. Let's explore the critical differences between the two.
1. Logical Table Partitioning (Vertical/Horizontal Slicing on a Single Node)
Table partitioning is a database-native feature where a single massive logical table (e.g., events) is physically broken down into smaller, individual tables under the hood.
Importantly, all of these smaller tables exist on the exact same database server.
When your application executes `SELECT * FROM events WHERE created_at = '2026-01-01'`, the database query planner intercepts the query, realizes it only needs data from the events_2026_01 partition, and skips scanning all the other partitions (a step known as partition pruning).
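To make this concrete, here is a minimal PostgreSQL sketch of a range-partitioned events table (the table and column names are illustrative):

```sql
-- A range-partitioned parent table: it holds no data itself.
CREATE TABLE events (
    id         bigint GENERATED ALWAYS AS IDENTITY,
    created_at timestamptz NOT NULL,
    payload    jsonb
) PARTITION BY RANGE (created_at);

-- One child table per month, each with its own (small) indexes.
CREATE TABLE events_2026_01 PARTITION OF events
    FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE TABLE events_2026_02 PARTITION OF events
    FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');

-- EXPLAIN shows the planner touching only events_2026_01; every
-- other partition is pruned away before execution.
EXPLAIN SELECT * FROM events WHERE created_at = '2026-01-01';
```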
The B-Tree Index Problem
Why do we partition tables? To save the B-Tree index.
In PostgreSQL or MySQL, an index is structured as a B-Tree. When a table has 5 million rows, the B-Tree is shallow, and traversing it to find a row takes a fraction of a millisecond. When a table has 5 billion rows, the B-Tree gains extra levels and, more importantly, becomes so large that it no longer fits in the server's RAM (buffer cache).
Every time you insert a new row into a 5-billion-row table, the database has to update this massive B-Tree. If the required parts of the index aren't in RAM, the database has to read from the disk (an I/O miss), dramatically slowing down write performance.
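You can check for this pressure directly. A quick PostgreSQL sketch that compares an index's on-disk size against the buffer cache (`events_pkey` is a hypothetical index name):

```sql
-- How big is the index on disk?
SELECT pg_size_pretty(pg_relation_size('events_pkey')) AS index_size;

-- How much RAM does PostgreSQL's buffer cache get?
SHOW shared_buffers;

-- If index_size dwarfs shared_buffers, index traversals and updates
-- will regularly miss the cache and fall through to disk I/O.
```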
How Partitioning Solves This
By using Range Partitioning (e.g., one partition per month), you create a fresh, tiny B-Tree index for each new partition.
- The current month's index is small enough to stay 100% in RAM.
- `INSERT` performance remains lightning fast forever.
- Old partitions can easily be archived or dropped instantly using `DROP TABLE events_2024_01`, which is an `O(1)` metadata operation, rather than running a massive `DELETE` query that takes hours and causes table bloat (see the sketch below).
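For example, retiring January 2024 data in PostgreSQL looks like this, assuming a monthly partition like those sketched earlier:

```sql
-- O(1): detach the partition (an instant metadata change), then drop it.
ALTER TABLE events DETACH PARTITION events_2024_01;
DROP TABLE events_2024_01;

-- Compare with the non-partitioned alternative: a massive DELETE that
-- scans and rewrites rows for hours, then leaves dead tuples behind
-- for VACUUM to clean up.
-- DELETE FROM events
-- WHERE created_at >= '2024-01-01' AND created_at < '2024-02-01';
```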
Because all partitions live on the same RDS instance, partitioning does absolutely nothing to help you if you are maxing out your CPU cores, or if your overall disk IOPS are saturated. It purely solves logical data management and index traversal speed.
2. Physical Database Sharding (Horizontal Scaling Across Nodes)
If partitioning is slicing a pie into 12 pieces but leaving them on the same plate, Sharding is buying 12 different plates and putting a slice on each one.
Sharding (often called Horizontal Partitioning) is the architectural process of distributing your dataset across multiple, independent physical database servers.
- Server A handles users with IDs 1 to 1,000,000.
- Server B handles users with IDs 1,000,001 to 2,000,000.
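The routing rule itself is trivial arithmetic. A small SQL illustration of the range mapping above (the 1,000,000-users-per-shard figure comes from the list; everything else is hypothetical):

```sql
-- Map a user_id to its shard with integer division.
SELECT user_id,
       ((user_id - 1) / 1000000) + 1 AS shard_id
FROM (VALUES (42::bigint), (1000000::bigint), (1500000::bigint)) AS t(user_id);
-- user_id 42       -> shard 1
-- user_id 1000000  -> shard 1
-- user_id 1500000  -> shard 2
```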
The Problem Sharding Solves
Sharding solves the Vertical Ceiling (discussed in Part 1). When a single 128-core AWS instance cannot physically handle the CPU or I/O demands of your traffic, sharding lets you keep adding capacity horizontally. If you need more CPU, you just spin up another physical database shard and migrate a chunk of your users to it.
The Massive Trade-off
Sharding forces you to embrace distributed systems complexity:
- No Cross-Shard Joins: You cannot write a SQL `JOIN` between a user on Shard A and an order on Shard B. The application layer must fetch the data from both shards and join it in application memory.
- No Distributed Transactions: If you need to deduct money from a user on Shard A and credit a user on Shard B, you can no longer rely on `BEGIN` and `COMMIT`. You must implement complex distributed transaction patterns like Two-Phase Commit (2PC) or the Saga Pattern (see the 2PC sketch after this list).
- Operational Overhead: Instead of monitoring, backing up, and maintaining one massive database, your DevOps team must now manage 10, 20, or 100 separate databases.
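To make the transaction problem concrete, here is a hedged sketch of Two-Phase Commit using PostgreSQL's native `PREPARE TRANSACTION` support. A coordinator in your application runs phase 1 on each shard, then phase 2 only if every prepare succeeded (this assumes `max_prepared_transactions > 0` on both shards; the table and transaction names are hypothetical):

```sql
-- Phase 1, on Shard A: do the work, then park it as a prepared transaction.
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE user_id = 42;
PREPARE TRANSACTION 'xfer-42-1500000';

-- Phase 1, on Shard B:
BEGIN;
UPDATE accounts SET balance = balance + 100 WHERE user_id = 1500000;
PREPARE TRANSACTION 'xfer-42-1500000';

-- Phase 2, on BOTH shards, only if both prepares succeeded:
COMMIT PREPARED 'xfer-42-1500000';

-- If either prepare failed, the coordinator must instead run:
-- ROLLBACK PREPARED 'xfer-42-1500000';
```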
The database engine usually does not know it is sharded. Your application code (or sharding middleware such as Vitess for MySQL, or the Citus extension for PostgreSQL) must intercept every query, analyze the Shard Key, determine which physical database holds the data, and route the connection accordingly. This typically requires significant code rewrites.
Summary: Which one do you need?
| Feature | Logical Partitioning | Physical Sharding |
|---|---|---|
| Location | Single Server | Multiple Servers |
| Solves | Huge Indexes, Slow Deletes, Archiving | Maxed out CPU, RAM, Disk I/O |
| SQL Joins | Works perfectly across partitions | Broken across shards |
| ACID Transactions | Fully Supported | Requires application-level sagas |
| App Complexity | Zero (handled by DB) | High (requires smart routing) |
The Golden Rule: Always logically partition your massive tables first. Only when you max out the physical limits of the largest cloud instance should you resort to physical sharding.
