Databases · Intermediate guide · Part 1 of 8 in Database Sharding Mastery

Database Sharding Part 1: The Vertical Ceiling

When should you stop buying bigger servers? Learn to identify the physical limits of vertical scaling and why sharding is your last resort.

Sachin Sarawgi · April 20, 2026 · 5 min read


In the early days of a startup, database scaling is a solved problem. You go into the AWS RDS or GCP Cloud SQL console, select a larger instance type, click "Modify," and your application is suddenly fast again.

This process—scaling by adding more CPU, RAM, and Disk I/O to a single machine—is known as Vertical Scaling.

Vertical scaling is glorious. It requires zero application code changes, near-zero downtime (with modern failover), and preserves the beautiful, monolithic ACID guarantees of your relational database. But eventually, every hyper-growth application hits a physical limit. We call this the Vertical Ceiling.

In Part 1 of this series, we will dissect the physical, architectural, and operational ceilings of single-node databases, and define exactly when you must abandon vertical scaling and embrace sharding.

The Physical Limits of Hardware

You might think, "I'll just keep buying bigger servers." But the cloud providers have limits.

As of this writing, if you attempt to scale a PostgreSQL database on AWS, the largest instance you can provision (like a db.x2iedn.32xlarge) tops out at 128 vCPUs and 4,096 GiB of RAM.

While 4TB of RAM sounds virtually infinite for a startup, for an application processing millions of events per second, it is remarkably small. More importantly, simply adding cores and RAM yields diminishing returns due to NUMA (Non-Uniform Memory Access) architecture.

The NUMA Penalty

On massive servers, CPU sockets are grouped into NUMA "nodes," each with its own local memory. If a core on node 1 needs to read memory attached to node 4, the request must traverse the inter-socket interconnect. As you scale vertically toward 128 cores, the overhead of this cross-node memory traffic grows sharply. Doubling the cores rarely doubles database throughput; it often yields only a 30-40% gain.
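The diminishing-returns claim can be made concrete with a back-of-the-envelope model. The sketch below uses the Universal Scalability Law; the contention (`SIGMA`) and coherency (`KAPPA`) coefficients are illustrative guesses, not measurements of any real machine.

```python
# Back-of-the-envelope model of NUMA scaling loss using the
# Universal Scalability Law (USL). SIGMA (contention) and KAPPA
# (cross-node coherency cost) are illustrative, not measured.
SIGMA = 0.014
KAPPA = 0.00002

def usl_speedup(n: int) -> float:
    """Relative throughput of n cores vs. a single core."""
    return n / (1 + SIGMA * (n - 1) + KAPPA * n * (n - 1))

for cores in (16, 32, 64, 128):
    print(f"{cores:>3} cores -> {usl_speedup(cores):.1f}x throughput")
```

With these coefficients the curve is still rising at 128 cores, but each doubling buys far less than 2x, which is exactly the "30-40% for twice the hardware" effect described above.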

The I/O Bottleneck

Even if you have infinite CPU, you do not have infinite disk speed.

Relational databases use Write-Ahead Logging (WAL). Every time you INSERT or UPDATE a row, the database must write that transaction to the WAL on disk before acknowledging the commit to the user. This is what guarantees durability.

If you are using network-attached storage like AWS EBS, you are fundamentally constrained by the network. AWS io2 Block Express volumes max out at 256,000 IOPS and 4,000 MB/s throughput.

When your application is ingesting 300,000 telemetry events per second, your EBS volume will physically throttle the database. At this point, the Vertical Ceiling is hard-capped by the physics of fiber-optic network cables in the AWS datacenter.
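A quick feasibility check makes the throttling point concrete. The volume ceiling below is the published io2 Block Express limit; the event rate and writes-per-event figures are illustrative assumptions, and real WAL group commit would batch some of these writes.

```python
# Rough feasibility check: can one io2 Block Express volume absorb
# the write load? Workload figures are illustrative assumptions;
# group commit batching would reduce the real WAL IOPS somewhat.
VOLUME_MAX_IOPS = 256_000   # published io2 Block Express ceiling
EVENTS_PER_SEC = 300_000    # assumed ingest rate
WRITES_PER_EVENT = 2        # assumed: 1 WAL write + 1 heap/index write

required_iops = EVENTS_PER_SEC * WRITES_PER_EVENT
print(f"required: {required_iops:,} IOPS vs ceiling {VOLUME_MAX_IOPS:,}")
print("throttled!" if required_iops > VOLUME_MAX_IOPS else "ok")
```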

The Operational Nightmare

Let's assume you somehow bypass the CPU and I/O limits. You now have a 30 Terabyte PostgreSQL database sitting on a single monolithic instance. You have just entered the operational nightmare phase.

1. The Vacuuming Problem

PostgreSQL uses MVCC (Multi-Version Concurrency Control). When you update a row, PostgreSQL writes a new version and marks the old one as dead. A background process called autovacuum must clean up these dead rows. On a 30TB database with heavy write churn, a single vacuum pass can take days, consuming disk I/O the whole time while dead rows keep accumulating as severe table bloat.
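A rough estimate shows how bloat compounds while autovacuum is still running. All inputs below are illustrative assumptions, not measurements.

```python
# Illustrative estimate of dead-tuple accumulation while a single
# autovacuum pass is still in flight. Every input is an assumption.
UPDATES_PER_SEC = 20_000        # assumed write churn
VACUUM_DURATION_HOURS = 48      # assumed time for one vacuum pass
ROW_SIZE_BYTES = 200            # assumed average row width

dead_rows = UPDATES_PER_SEC * VACUUM_DURATION_HOURS * 3600
bloat_gb = dead_rows * ROW_SIZE_BYTES / 1e9
print(f"{dead_rows:,} dead rows accumulate (~{bloat_gb:,.0f} GB of bloat)")
```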

2. The Index Rebuild

Imagine needing to create a new index on a users table with 5 billion rows. Even using CREATE INDEX CONCURRENTLY, the operation can take days or even weeks. If the index build fails halfway through, you have to start over. Schema migrations become a high-stakes, multi-week anxiety attack.

3. The Backup and Restore Window

If your single massive database gets corrupted, you must restore it from a snapshot. Restoring a 30TB volume and rolling forward the transaction logs can easily result in 12+ hours of complete production downtime.

The Blast Radius

A monolithic database means a monolithic blast radius. If one bad analytical query bypasses an index and performs a full table scan on 5 billion rows, it will evict the entire working set from RAM, spike CPU to 100%, and take down the entire application for all users globally.

The Concurrency Bottleneck (Latch Contention)

Finally, there is contention inside the database engine itself. When 10,000 active connections attempt to update different rows that happen to live on the same physical 8KB buffer page, the engine must briefly lock that page in memory with a lightweight lock called a latch. This is latch contention.

No amount of CPU or RAM can solve this. The database engine itself becomes the bottleneck because thousands of threads are waiting in line to modify the same memory block.
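A one-line calculation shows why "different rows" still collide. The 8 KB page size is PostgreSQL's default; the average row width is an assumption for the sketch.

```python
# Why updates to "different rows" still collide: rows packed into
# the same 8 KB heap page share one buffer latch.
PAGE_SIZE = 8192    # PostgreSQL default page size in bytes
ROW_SIZE = 100      # assumed average row width incl. tuple header

rows_per_page = PAGE_SIZE // ROW_SIZE
print(f"~{rows_per_page} rows share a single page (and its latch)")
```

With ~80 hot rows per page, two "unrelated" updates have a real chance of serializing on the same latch.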

When is Sharding the Answer?

Sharding is the process of splitting your monolithic database into multiple, independent databases (shards), each responsible for a subset of the data.
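A minimal sketch of what that routing looks like in application code, assuming a hypothetical four-shard layout and simple hash-modulo placement (later parts of this series cover consistent hashing):

```python
# Minimal shard router: map a key to one of N independent
# databases. Shard names and the hash-modulo scheme are
# illustrative assumptions, not a production design.
import hashlib

SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]

def shard_for(user_id: int) -> str:
    """Deterministically pick the shard that owns this user's rows."""
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

print(shard_for(1045))  # same user_id always routes to the same shard
```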

Because sharding breaks relational constraints (foreign keys, global secondary indexes, and global transactions) and vastly increases application complexity, it must be your absolute last resort.

Before you shard, you must exhaust these alternatives:

  1. Vertical Scaling: Max out the RDS instance size until the ROI diminishes.
  2. Read Replicas: Offload read-only SELECT traffic to 5-10 read replicas.
  3. Caching: Put Redis or Memcached in front of the database to absorb 90% of read traffic.
  4. Table Partitioning: Break massive tables into monthly/weekly partitions within the same database to keep index sizes small and make vacuuming manageable.
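Alternative #4 can be sketched as a simple time-based routing function; the `events_yYYYYmMM` partition naming convention below is an assumption for illustration.

```python
# Sketch of time-based partition routing within a single database.
# The partition naming convention is a hypothetical example.
from datetime import date

def partition_for(event_date: date) -> str:
    """Name of the monthly partition that should hold this row."""
    return f"events_y{event_date.year}m{event_date.month:02d}"

print(partition_for(date(2026, 4, 20)))  # -> events_y2026m04
```

Each partition keeps its own small indexes and can be vacuumed (or dropped) independently, which is exactly what makes this alternative worth exhausting before sharding.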

When you have implemented all of the above, and your primary database is still hitting 90% CPU utilization purely from write traffic, you have officially hit the Vertical Ceiling.

It is time to shard.


📚 Recommended Resources

  - Designing Data-Intensive Applications (Essential): the go-to book for understanding databases, consistency, and distributed data.
  - MongoDB — The Complete Developer's Guide (Udemy): comprehensive MongoDB course from basics to advanced aggregations.


Written by Sachin Sarawgi

Engineering Manager and backend engineer with 10+ years building distributed systems across fintech, enterprise SaaS, and startups. CodeSprintPro is where I write practical guides on system design, Java, Kafka, databases, AI infrastructure, and production reliability.

Continue Series: Database Sharding Mastery

Lesson 1 of 8 in this learning sequence.

  1. Database Sharding Part 1: The Vertical Ceiling (Intermediate) — this article
  2. Database Sharding Part 2: Partitioning vs. Sharding (Intermediate)
  3. Database Sharding Part 3: The Shard Key Blueprint (Advanced)
  4. Database Sharding Part 4: Consistent Hashing Internals (Advanced)
  5. Database Sharding Part 5: The Scatter-Gather Problem (Advanced)
  6. Database Sharding Part 6: Zero-Downtime Re-sharding (Advanced)
  7. Database Sharding Part 7: Case Study - Scaling Discord to Billions (Advanced)
  8. Query Optimization: The Hidden Cost of Cross-Shard Joins (Expert)
