Database Sharding Part 1: The Vertical Ceiling
In the early days of a startup, database scaling is a solved problem. You go into the AWS RDS or GCP Cloud SQL console, select a larger instance type, click "Modify," and your application is suddenly fast again.
This process—scaling by adding more CPU, RAM, and Disk I/O to a single machine—is known as Vertical Scaling.
Vertical scaling is glorious. It requires zero application code changes, near-zero downtime (a brief failover blip with a modern Multi-AZ setup), and preserves the beautiful, monolithic ACID guarantees of your relational database. But eventually, every hyper-growth application hits a physical limit. We call this the Vertical Ceiling.
In Part 1 of this series, we will dissect the physical, architectural, and operational ceilings of single-node databases, and define exactly when you must abandon vertical scaling and embrace sharding.
The Physical Limits of Hardware
You might think, "I'll just keep buying bigger servers." But the cloud providers have limits.
As of this writing, if you attempt to scale a PostgreSQL database on AWS, the largest instance you can provision (like a db.x2iedn.32xlarge) tops out at 128 vCPUs and 4,096 GiB of RAM.
While 4TB of RAM sounds virtually infinite for a startup, for an application processing millions of events per second, it is remarkably small. More importantly, simply adding cores and RAM yields diminishing returns due to NUMA (Non-Uniform Memory Access) architecture.
On massive servers, CPUs are divided into NUMA nodes, one per socket, each with its own local memory. If a core on socket 1 needs to read memory attached to socket 4, the request must traverse the cross-socket interconnect, which is markedly slower than a local access. As you scale vertically toward 128 cores, a growing fraction of memory accesses cross sockets, and that overhead compounds. Doubling the cores rarely doubles database throughput; it often yields only a 30-40% gain.
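To make the diminishing returns concrete, here is a toy back-of-the-envelope model. The remote-access penalty and cores-per-socket figures are illustrative assumptions, not measured benchmarks:

```python
# Toy model of NUMA scaling: as cores span more sockets, a growing
# fraction of memory accesses are remote, and remote accesses are slower.
# The 2.0x remote penalty and 32 cores/socket are assumptions.

CORES_PER_SOCKET = 32
REMOTE_PENALTY = 2.0  # a remote access costs ~2x a local one (assumed)

def relative_throughput(cores: int) -> float:
    sockets = max(1, -(-cores // CORES_PER_SOCKET))  # ceiling division
    # With n sockets and a uniformly accessed working set, roughly
    # (n - 1) / n of memory accesses land on a remote socket.
    remote_fraction = (sockets - 1) / sockets
    avg_access_cost = (1 - remote_fraction) + remote_fraction * REMOTE_PENALTY
    return cores / avg_access_cost

base = relative_throughput(64)
print(f"64 -> 128 cores: {relative_throughput(128) / base:.2f}x throughput")
# Well under 2x: doubling cores buys far less than double the work.
```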
The I/O Bottleneck
Even if you have infinite CPU, you do not have infinite disk speed.
Relational databases use Write-Ahead Logging (WAL). Every time you INSERT or UPDATE a row, the database must write that transaction to the WAL on disk before acknowledging the commit to the user. This is what guarantees durability.
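To make the ordering concrete, here is a minimal sketch of the write-ahead idea. It is a simplified illustration of the commit path, not PostgreSQL's actual implementation, and the file name is an assumption:

```python
import os

# Simplified write-ahead logging: the log record must be durably on disk
# (fsync) *before* we acknowledge the commit to the client.
class WriteAheadLog:
    def __init__(self, path: str = "wal.log"):  # illustrative file name
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND)

    def commit(self, record: bytes) -> None:
        os.write(self.fd, record + b"\n")  # append the log record
        os.fsync(self.fd)                  # force it to physical storage
        # Only now is it safe to reply "committed": even if we crash
        # here, recovery can replay the record from the log.

wal = WriteAheadLog()
wal.commit(b"UPDATE accounts SET balance = balance - 100 WHERE id = 1")
```

That fsync on every commit is why write-heavy workloads live and die by disk latency.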
If you are using network-attached storage like AWS EBS, you are fundamentally constrained by the network. AWS io2 Block Express volumes max out at 256,000 IOPS and 4,000 MB/s throughput.
When your application is ingesting 300,000 telemetry events per second, each one generating WAL, data page, and index writes, your EBS volume will physically throttle the database. At this point, the Vertical Ceiling is hard-capped by the physics of fiber-optic network cables in the AWS datacenter.
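Some rough arithmetic shows how quickly the budget evaporates. Every ratio below is an illustrative assumption; real workloads batch commits and buffer page writes very differently:

```python
# Back-of-the-envelope I/O budget. All ratios are assumptions.
events_per_sec = 300_000
events_per_wal_write = 10      # assumed group-commit batching
page_writes_per_event = 0.5    # assumed data page + index churn, amortized

wal_iops = events_per_sec / events_per_wal_write          # 30,000
page_iops = events_per_sec * page_writes_per_event        # 150,000
checkpoint_and_vacuum_iops = 80_000                       # assumed overhead

total = wal_iops + page_iops + checkpoint_and_vacuum_iops
print(f"Estimated IOPS: {total:,.0f} vs io2 ceiling of 256,000")
# ~260,000 IOPS: at the volume's hard ceiling before serving a single read.
```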
The Operational Nightmare
Let's assume you somehow bypass the CPU and I/O limits. You now have a 30 Terabyte PostgreSQL database sitting on a single monolithic instance. You have just entered the operational nightmare phase.
1. The Vacuuming Problem
PostgreSQL uses MVCC (Multi-Version Concurrency Control). When you update a row, it creates a new version and marks the old one as dead. A background process called autovacuum must clean up these dead rows. On a 30TB database with high write churn, the vacuum process can take days to run, eating up disk I/O and causing severe table bloat.
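You can watch this problem develop through PostgreSQL's pg_stat_user_tables statistics view. Here is a sketch using psycopg2; the connection string is a placeholder:

```python
import psycopg2

# Report the tables with the most dead tuples awaiting vacuum.
# pg_stat_user_tables is a standard PostgreSQL statistics view.
conn = psycopg2.connect("dbname=app user=postgres host=localhost")
with conn.cursor() as cur:
    cur.execute("""
        SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
        FROM pg_stat_user_tables
        ORDER BY n_dead_tup DESC
        LIMIT 10
    """)
    for relname, live, dead, last_vac in cur.fetchall():
        print(f"{relname}: {dead:,} dead / {live:,} live, "
              f"last autovacuum: {last_vac}")
conn.close()
```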
2. The Index Rebuild
Imagine needing to create a new index on a users table with 5 billion rows. Even using CREATE INDEX CONCURRENTLY, this operation can take days or weeks. If the build fails halfway through, it leaves an INVALID index behind that you must drop before starting over. Schema migrations become a high-stakes, multi-week anxiety attack.
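The safer pattern is to build concurrently and then verify the index actually became valid. A sketch, assuming psycopg2 and hypothetical table and index names:

```python
import psycopg2

# CREATE INDEX CONCURRENTLY cannot run inside a transaction block,
# so autocommit is required. Names and DSN are hypothetical.
conn = psycopg2.connect("dbname=app user=postgres host=localhost")
conn.autocommit = True
with conn.cursor() as cur:
    try:
        cur.execute(
            "CREATE INDEX CONCURRENTLY idx_users_email ON users (email)"
        )
    except psycopg2.Error as e:
        print(f"Index build failed: {e}")
    # A failed concurrent build leaves an INVALID index behind;
    # find it and drop it before retrying.
    cur.execute("""
        SELECT c.relname
        FROM pg_index i
        JOIN pg_class c ON c.oid = i.indexrelid
        WHERE c.relname = 'idx_users_email' AND NOT i.indisvalid
    """)
    if cur.fetchone():
        cur.execute("DROP INDEX CONCURRENTLY idx_users_email")
        print("Dropped invalid index; safe to retry the build.")
conn.close()
```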
3. The Backup and Restore Window
If your single massive database gets corrupted, you must restore it from a snapshot. Restoring a 30TB volume and rolling forward the transaction logs can easily result in 12+ hours of complete production downtime.
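The arithmetic is unforgiving. A back-of-the-envelope estimate, where the restore throughput figure is an illustrative assumption:

```python
# Rough restore-time estimate. The effective throughput figure is an
# assumption; real restores also pay for WAL replay on top.
db_size_tb = 30
restore_throughput_mb_s = 800          # assumed effective restore rate

db_size_mb = db_size_tb * 1024 * 1024  # 31,457,280 MB
restore_hours = db_size_mb / restore_throughput_mb_s / 3600
print(f"Snapshot restore alone: ~{restore_hours:.1f} hours")  # ~10.9 hours
# Add hours of transaction log replay, and 12+ hours of downtime is realistic.
```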
A monolithic database means a monolithic blast radius. If one bad analytical query accidentally bypasses an index and performs a full table scan on 5 billion rows, it will evict the entire working set from RAM, spike CPU to 100%, and take down the entire application for all users globally.
The Concurrency Bottleneck (Latch Contention)
Finally, there is internal database contention. When 10,000 active connections attempt to update different rows that happen to live on the same physical 8KB page, the database must briefly lock that page's in-memory buffer for each change. This is known as latch contention.
No amount of CPU or RAM can solve this. The database engine itself becomes the bottleneck because thousands of threads are waiting in line to modify the same memory block.
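You can reproduce a miniature version of this effect by pointing many sessions at rows that share one hot page. This sketch assumes a hypothetical counters table pre-populated with rows 1-10 (small enough to share a heap page) and a placeholder connection string:

```python
import threading
import psycopg2

# Many sessions updating rows on one hot page serialize on that page's
# in-memory lock. Table name and DSN are hypothetical.
DSN = "dbname=app user=postgres host=localhost"

def hammer(worker_id: int) -> None:
    conn = psycopg2.connect(DSN)
    conn.autocommit = True
    with conn.cursor() as cur:
        for _ in range(1_000):
            cur.execute(
                "UPDATE counters SET value = value + 1 WHERE id = %s",
                (worker_id % 10 + 1,),
            )
    conn.close()

threads = [threading.Thread(target=hammer, args=(i,)) for i in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# Throughput plateaus: sessions queue on the hot page, not on CPU.
```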
When is Sharding the Answer?
Sharding is the process of splitting your monolithic database into multiple, independent databases (shards), each responsible for a subset of the data.
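At its simplest, the application routes each query by hashing a shard key. A minimal sketch, where the shard DSNs and the choice of user_id as the key are illustrative assumptions:

```python
import hashlib

# Minimal hash-based shard routing. Shard list and key choice are
# illustrative assumptions.
SHARDS = [
    "postgres://shard0.internal/app",
    "postgres://shard1.internal/app",
    "postgres://shard2.internal/app",
    "postgres://shard3.internal/app",
]

def shard_for(user_id: int) -> str:
    # Stable hash, so the same user always routes to the same shard.
    digest = hashlib.sha256(str(user_id).encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

print(shard_for(42))     # every query for user 42 goes to this shard
print(shard_for(1337))   # a different user may land elsewhere
```

Note that naive modulo routing has a cost: adding a fifth shard remaps most keys, forcing a large data migration. That pain is part of the complexity discussed below.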
Because sharding breaks relational constraints (foreign keys, global secondary indexes, and global transactions) and vastly increases application complexity, it must be your absolute last resort.
Before you shard, you must exhaust these alternatives:
- Vertical Scaling: Max out the RDS instance size until the ROI diminishes.
- Read Replicas: Offload all SELECT queries to 5-10 read replicas.
- Caching: Put Redis or Memcached in front of the database to absorb 90% of read traffic.
- Table Partitioning: Break massive tables into monthly/weekly partitions within the same database to keep index sizes small and make vacuuming manageable (a minimal sketch follows this list).
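For the partitioning option, here is a minimal sketch of PostgreSQL declarative range partitioning via psycopg2. Table name, columns, date ranges, and DSN are all illustrative:

```python
import psycopg2

# Declarative monthly partitioning: each partition gets its own smaller
# indexes, and autovacuum works partition by partition.
conn = psycopg2.connect("dbname=app user=postgres host=localhost")
conn.autocommit = True
with conn.cursor() as cur:
    cur.execute("""
        CREATE TABLE events (
            id         bigserial,
            created_at timestamptz NOT NULL,
            payload    jsonb
        ) PARTITION BY RANGE (created_at)
    """)
    cur.execute("""
        CREATE TABLE events_2024_01 PARTITION OF events
            FOR VALUES FROM ('2024-01-01') TO ('2024-02-01')
    """)
    cur.execute("""
        CREATE TABLE events_2024_02 PARTITION OF events
            FOR VALUES FROM ('2024-02-01') TO ('2024-03-01')
    """)
conn.close()
```

Crucially, all of this stays inside one database: queries, foreign keys, and transactions keep working unchanged.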
When you have implemented all of the above, and your primary database is still hitting 90% CPU utilization purely from write traffic, you have officially hit the Vertical Ceiling.
It is time to shard.
