Part 1 of 3 in Database Internals

LSM-Tree Compaction Strategies: Leveled vs. Size-Tiered

Why does Cassandra write faster than MongoDB? Deep dive into LSM-tree compaction, read vs. write amplification, and storage engine trade-offs.

Sachin Sarawgi | April 20, 2026 | 1 min read


The storage engine sits at the heart of write-optimized data stores such as Cassandra, ScyllaDB, and RocksDB (an embedded key-value engine). Their LSM-trees (log-structured merge-trees) maximize write throughput by turning random writes into sequential appends.

1. The LSM-Tree Core Mechanics

  1. Memtable: Writes are buffered in an in-memory sorted structure, commonly a skip list (a probabilistic alternative to a balanced tree).
  2. Commit Log: Each write is also appended to a sequential file on disk, so the Memtable can be rebuilt after a crash.
  3. SSTable: Once the Memtable is full, it is flushed to disk as an immutable Sorted String Table.
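The three steps above can be sketched as a toy write path. This is a minimal illustration, not how Cassandra or RocksDB implement it: the class name MiniLSM, the JSON-lines file format, and the entry-count flush threshold are all assumptions made for the example, and a plain dict sorted at flush time stands in for a skip list.

```python
import json
import os
import tempfile


class MiniLSM:
    """Toy LSM write path: commit log append, memtable buffer, SSTable flush."""

    def __init__(self, data_dir, memtable_limit=3):
        self.data_dir = data_dir
        self.memtable_limit = memtable_limit  # hypothetical threshold, counted in entries
        self.memtable = {}                    # stand-in for a sorted in-memory structure
        self.sstables = []                    # paths of flushed files, oldest first
        self.log_path = os.path.join(data_dir, "commit.log")
        self.log = open(self.log_path, "a", encoding="utf-8")

    def put(self, key, value):
        # 1. Durability: append the write to the sequential commit log.
        self.log.write(json.dumps([key, value]) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())
        # 2. Buffer the write in memory.
        self.memtable[key] = value
        # 3. Flush once the memtable is full.
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Write the memtable to a new sorted, never-modified file."""
        if not self.memtable:
            return None
        name = f"sstable-{len(self.sstables):04d}.jsonl"
        path = os.path.join(self.data_dir, name)
        with open(path, "w", encoding="utf-8") as f:
            for key in sorted(self.memtable):
                f.write(json.dumps([key, self.memtable[key]]) + "\n")
        self.sstables.append(path)
        self.memtable.clear()
        # The flushed writes now live in the SSTable, so the log can start over.
        self.log.close()
        self.log = open(self.log_path, "w", encoding="utf-8")
        return path

    def close(self):
        self.log.close()


if __name__ == "__main__":
    db = MiniLSM(tempfile.mkdtemp(), memtable_limit=3)
    for k in ["c", "a", "b", "d"]:
        db.put(k, k.upper())
    print(db.sstables, db.memtable)
    db.close()
```

Note that every disk write here is an append or a whole-file write; nothing is updated in place, which is where the sequential I/O pattern comes from.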

2. Compaction: The Merging Process

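Each flush adds another SSTable, so files accumulate: a read may have to check several of them, and overwritten or deleted keys keep occupying disk space. Compaction merges SSTables in the background to keep both in check.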

  • Size-Tiered Compaction (STCS): Merges SSTables of similar sizes into a larger one. Excellent write throughput, but high read amplification, because a key may live in many overlapping SSTables.
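  • Leveled Compaction (LCS): Organizes files into levels of increasing size. Within each level from L1 upward, files cover disjoint key ranges (no overlapping keys); only L0, which receives fresh flushes, may overlap. Excellent read performance, but high write amplification, because data is rewritten each time it is merged into the next level.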
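As a rough sketch of the merging step, the code below shows two pieces: a merge of sorted SSTables in which the newest value for a key wins, and a simplified way to pick size-tiered candidates by grouping files of similar size. The function names, the ratio of 1.5, and the minimum of 4 files per bucket are illustrative assumptions, not the exact rules of any particular engine, and tombstone handling is left out.

```python
import heapq


def merge_sstables(tables):
    """Merge sorted SSTables into one; `tables` is ordered oldest to newest.

    Each table is a list of (key, value) pairs sorted by key, with unique keys.
    When a key appears in several tables, the value from the newest table wins.
    """
    # Tag entries with -age so that, for equal keys, newer tables sort first.
    tagged = (
        [(key, -age, value) for key, value in table]
        for age, table in enumerate(tables)
    )
    merged = []
    for key, _, value in heapq.merge(*tagged):
        # Keep only the first (newest) entry seen for each key.
        if not merged or merged[-1][0] != key:
            merged.append((key, value))
    return merged


def size_tiered_buckets(sizes, ratio=1.5, min_threshold=4):
    """Group SSTable sizes into buckets of similar size.

    A size joins the current bucket if it is at most `ratio` times the
    bucket's average. Only buckets with at least `min_threshold` files are
    returned as compaction candidates. Both parameters are hypothetical.
    """
    buckets = []
    for size in sorted(sizes):
        if buckets and size <= ratio * (sum(buckets[-1]) / len(buckets[-1])):
            buckets[-1].append(size)
        else:
            buckets.append([size])
    return [b for b in buckets if len(b) >= min_threshold]


if __name__ == "__main__":
    older = [("a", 1), ("c", 3)]
    newer = [("a", 10), ("b", 2)]
    print(merge_sstables([older, newer]))
    print(size_tiered_buckets([10, 11, 12, 10, 100, 500]))
```

A leveled strategy would use the same merge, but choose its inputs differently: one file from a level plus the files in the next level whose key ranges overlap it.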

3. Trade-offs

  • Write-Heavy: Use Size-Tiered, which rewrites each piece of data fewer times.
  • Read-Heavy: Use Leveled, which limits how many SSTables a read has to check.
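To make the read side of the trade-off concrete, the sketch below counts how many SSTables a point lookup might touch under each layout. It models each SSTable only by its (min_key, max_key) range, ignores Bloom filters and the overlapping L0 of a leveled layout, and uses hypothetical function names and key ranges throughout.

```python
from bisect import bisect_right


def probes_size_tiered(ranges, key):
    """Size-tiered: ranges may overlap, so every covering table is a candidate."""
    return sum(1 for lo, hi in ranges if lo <= key <= hi)


def probes_leveled(levels, key):
    """Leveled: each level is a sorted list of disjoint ranges.

    A binary search finds the single range that could hold the key,
    so a lookup touches at most one table per level.
    """
    probes = 0
    for level in levels:
        starts = [lo for lo, _ in level]
        i = bisect_right(starts, key) - 1
        if i >= 0 and level[i][1] >= key:
            probes += 1
    return probes


if __name__ == "__main__":
    tiered = [("a", "z"), ("b", "y"), ("c", "x"), ("a", "m")]
    leveled = [
        [("a", "f"), ("g", "m"), ("n", "z")],
        [("a", "c"), ("d", "h"), ("i", "p"), ("q", "z")],
    ]
    print(probes_size_tiered(tiered, "k"))
    print(probes_leveled(leveled, "k"))
```

In this made-up layout the size-tiered lookup has four candidate files while the leveled lookup has two, one per level. The price of that bound is paid on the write path, since keeping ranges disjoint means rewriting data as it moves down the levels.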


