
Cloud-Native Databases: Why the Log is the Database

Explore the architectural shift in cloud-native databases like Amazon Aurora. Learn why separating storage from compute and treating the log as the source of truth is the future.

Sachin Sarawgi·April 20, 2026·2 min read
#cloud-native #databases #amazon-aurora #distributed-systems #architecture


Traditional databases (PostgreSQL, MySQL) were designed to run on a single machine where CPU and disk are tightly coupled. In the cloud, this architecture hits a ceiling. Modern cloud-native databases like Amazon Aurora solve this with a radically different design: the log is the database.

1. The Bottleneck: Page-Based Storage

In a traditional DB, modifying even a single row eventually forces the engine to write the entire page (8 KB in PostgreSQL, 16 KB in InnoDB) back to disk. It must also write the change to the Write-Ahead Log (WAL) for durability. The result is significant write amplification and heavy network traffic between the DB and its storage.
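A back-of-the-envelope sketch makes the amplification concrete. The sizes below are illustrative assumptions, not measurements from any specific engine:

```python
# Hypothetical write-amplification arithmetic for a page-based engine.
# Sizes are illustrative assumptions, not benchmarks.

PAGE_SIZE = 8 * 1024      # 8 KB data page (PostgreSQL's default)
WAL_RECORD_SIZE = 120     # a small redo record for a single-row update

def bytes_written_traditional(rows_updated: int) -> int:
    """Each updated row dirties one page; the engine eventually writes
    the full page *and* a WAL record for durability."""
    return rows_updated * (PAGE_SIZE + WAL_RECORD_SIZE)

def bytes_written_log_only(rows_updated: int) -> int:
    """A log-first engine ships only the redo record downstream."""
    return rows_updated * WAL_RECORD_SIZE

updated = 1_000
trad = bytes_written_traditional(updated)
log_only = bytes_written_log_only(updated)
print(f"traditional: {trad:,} bytes, log-only: {log_only:,} bytes")
print(f"amplification factor: {trad / log_only:.0f}x")
```

Even with generous assumptions (one dirty page per row, a modest record size), the page-based path moves roughly two orders of magnitude more bytes than the log record itself.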

2. The Aurora Revolution: No More Pages

Amazon Aurora decouples Compute (the DB engine) from Storage.

  • When a write happens, the Compute node only sends the Log Record (the redo log) to the storage layer. It does not send full pages.
  • Result: 90% less network traffic compared to traditional MySQL on EBS.
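Aurora's published design replicates each log record six ways across three Availability Zones and acknowledges a write once a 4-of-6 quorum of storage nodes has it. The toy model below sketches that write path; all class and field names are invented for illustration:

```python
# Toy sketch of an Aurora-style quorum write: the compute node sends only a
# small redo record to each of 6 storage nodes and waits for 4 acks.
# Names and structures are illustrative assumptions, not Aurora's API.
from dataclasses import dataclass

@dataclass
class RedoRecord:
    lsn: int          # log sequence number, orders all changes
    page_id: int      # which page this change applies to
    payload: bytes    # the row-level change, NOT the whole page

class StorageNode:
    def __init__(self):
        self.log = []

    def append(self, rec: RedoRecord) -> bool:
        self.log.append(rec)   # durable append; the page is materialized later
        return True

def quorum_write(nodes, rec, write_quorum=4):
    """Commit succeeds as soon as a quorum of nodes has the log record."""
    acks = sum(1 for n in nodes if n.append(rec))
    return acks >= write_quorum

nodes = [StorageNode() for _ in range(6)]
rec = RedoRecord(lsn=1, page_id=42, payload=b"UPDATE ...")
print(quorum_write(nodes, rec))  # True
```

Note what crosses the network: a record of perhaps a hundred bytes, never an 8 KB page.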

3. Storage is Smart

In Aurora, the storage layer is not just a "dumb disk"; it is a distributed, purpose-built service.

  • It receives the log records and applies them to its own in-memory pages in the background.
  • It handles backups, snapshots, and repairs independently of the main DB node.
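A minimal sketch of this idea, using a toy page model where "pages" are dicts and redo records are keyed changes (names are illustrative, not Aurora's actual interface):

```python
# Sketch of a "smart" storage node: it accepts redo records and materializes
# pages lazily, applying pending log records in LSN order on first read.
class PageStore:
    def __init__(self):
        self.redo = {}            # page_id -> list of (lsn, change) pending
        self.pages = {}           # page_id -> materialized page state

    def receive(self, page_id, lsn, change):
        """Durably accept a redo record; no page write happens here."""
        self.redo.setdefault(page_id, []).append((lsn, change))

    def read_page(self, page_id):
        """Apply any pending redo records before serving the page."""
        page = self.pages.get(page_id, {})
        for lsn, change in sorted(self.redo.pop(page_id, [])):
            page.update(change)   # replaying the log *is* the write
        self.pages[page_id] = page
        return page

store = PageStore()
store.receive(7, lsn=1, change={"balance": 100})
store.receive(7, lsn=2, change={"balance": 90})
print(store.read_page(7))  # {'balance': 90}
```

The key property: the log records are the durable truth, and pages are just a cache derived from them, which is also why point-in-time recovery and snapshots fall out almost for free.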

4. Scaling Reads: The "Log-Only" Replication

Because the storage layer already has all the log records, adding a Read Replica in Aurora is nearly instantaneous. The replica mounts the same shared storage and applies the incoming log stream only to pages already in its local cache, so there is no replication lag caused by replaying writes to disk on the replica.
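The replica side can be sketched the same way: apply incoming redo only to pages already warm in the cache, and rely on shared storage (already up to date) for everything else. This is an illustrative model, not Aurora's implementation:

```python
# Sketch of a read replica that never writes pages to disk: it keeps its
# page cache fresh by applying the same redo stream the primary generates.
# Illustrative assumption-based model, not Aurora's actual code.
class ReadReplica:
    def __init__(self):
        self.cache = {}           # page_id -> cached page contents

    def on_redo(self, page_id, change):
        # Only update pages we already have cached. Uncached pages will be
        # fetched from shared storage, which is already current, on first read.
        if page_id in self.cache:
            self.cache[page_id].update(change)

replica = ReadReplica()
replica.cache[7] = {"balance": 100}
replica.on_redo(7, {"balance": 90})     # applied to the warm cache
replica.on_redo(8, {"qty": 5})          # ignored: page not cached
print(replica.cache)  # {7: {'balance': 90}}
```

Since the replica does no redo-to-disk replay, its catch-up work is bounded by cache updates, which is why spinning up an extra reader is cheap.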

5. Why this matters for the Future

This "Log-first" architecture is now being adopted by other systems:

  • Kafka: Often used as a primary source of truth for event-sourced systems.
  • Neon/CockroachDB: Using similar decoupling to provide serverless SQL capabilities.

Summary

Treating the database as a stream of immutable logs rather than a collection of mutable pages is the defining shift of cloud-native data. It enables superior scalability, near-instant recovery, and the cost-efficiency that modern distributed systems demand.


Recommended Resources

Designing Data-Intensive Applications (Best Seller)

The definitive guide to building scalable, reliable distributed systems by Martin Kleppmann.

Kafka: The Definitive Guide (Editor's Pick)

Real-time data and stream processing by Confluent engineers.

Apache Kafka Series on Udemy

Hands-on Kafka course covering producers, consumers, Kafka Streams, and Connect.



Written by

Sachin Sarawgi

Engineering Manager and backend engineer with 10+ years building distributed systems across fintech, enterprise SaaS, and startups. CodeSprintPro is where I write practical guides on system design, Java, Kafka, databases, AI infrastructure, and production reliability.
