
System Design: Designing a Distributed Search Engine (Elasticsearch)

How does Elasticsearch search through billions of documents in milliseconds? A technical deep dive into Inverted Indexes, Sharding, and Segment Merging.

Sachin Sarawgi · April 20, 2026 · 3 min read


Search is one of the most common ways people interact with massive datasets. Performing full-text search across billions of documents with millisecond latency requires a storage model very different from that of a traditional database.

1. Core Requirements

  • Full-Text Search: Supporting complex queries, fuzzy matching, and ranking.
  • High Throughput: Thousands of search queries per second.
  • Real-time Indexing: New documents should be searchable within seconds.
  • Scalability: Handling petabytes of data across hundreds of nodes.

2. The Inverted Index (The Foundation)

A traditional database table is like a book's table of contents (ID -> Data). An Inverted Index is like the index at the back of a textbook (Word -> List of IDs).

  • The Process:
    1. Tokenization: Splitting text into individual words.
    2. Normalization: Lowercasing and removing punctuation.
    3. Indexing: Mapping each word to a "Postings List" of document IDs where it appears.
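The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not how Lucene stores postings on disk; the function names (`tokenize`, `build_inverted_index`, `search_and`) and the sample documents are made up for the example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Tokenize and normalize: lowercase, then keep runs of letters/digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each term to a sorted postings list of document IDs."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def search_and(index, query):
    """Return the IDs of documents that contain every query term."""
    term_sets = [set(index.get(term, [])) for term in tokenize(query)]
    if not term_sets:
        return []
    return sorted(set.intersection(*term_sets))


# Hypothetical sample data.
docs = {
    1: "The quick brown fox.",
    2: "Quick thinking, quick results!",
    3: "A lazy brown dog.",
}
index = build_inverted_index(docs)
```

A multi-word query becomes an intersection of postings lists, so `search_and(index, "Quick BROWN")` only has to look at the entries for `quick` and `brown` instead of scanning every document.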

3. Storage Internals: Segments and Lucene

Elasticsearch is built on Apache Lucene.

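  • Segments: An index is made of multiple immutable "Segments." Once a segment is written to disk, its contents never change: a delete only marks the document as deleted, and an update is a delete followed by an insert into a newer segment.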
  • Merging: In the background, Lucene merges smaller segments into larger ones. This process (Segment Merging) is critical for performance and is where "deleted" documents are physically removed.
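The toy model below illustrates why deletes only take effect physically at merge time. It is a deliberately simplified sketch: the class names are hypothetical, tombstones are kept in one set for the whole index (Lucene tracks deletions per segment), updates are not modeled, and everything is merged at once rather than by a size-based policy.

```python
from types import MappingProxyType


class Segment:
    """A read-only batch of documents, standing in for an on-disk segment."""

    def __init__(self, docs):
        self.docs = MappingProxyType(dict(docs))  # rejects item assignment


class SegmentedIndex:
    def __init__(self):
        self.segments = []
        self.tombstones = set()  # IDs marked deleted but still stored

    def write_segment(self, docs):
        self.segments.append(Segment(docs))

    def delete(self, doc_id):
        # Segments are immutable, so a delete only records a marker.
        self.tombstones.add(doc_id)

    def live_ids(self):
        stored = {doc_id for seg in self.segments for doc_id in seg.docs}
        return stored - self.tombstones

    def stored_count(self):
        return sum(len(seg.docs) for seg in self.segments)

    def merge(self):
        """Rewrite all segments into one, dropping tombstoned documents."""
        merged = {}
        for seg in self.segments:
            for doc_id, text in seg.docs.items():
                if doc_id not in self.tombstones:
                    merged[doc_id] = text
        self.segments = [Segment(merged)]
        self.tombstones.clear()


index = SegmentedIndex()
index.write_segment({1: "red shoes", 2: "blue shoes"})
index.write_segment({3: "green hat"})
index.delete(2)
before_merge = (index.stored_count(), sorted(index.live_ids()))
index.merge()
after_merge = (index.stored_count(), sorted(index.live_ids()))
```

Before the merge, document 2 is invisible to searches but still occupies space; only the merge reclaims it.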

4. Distributed Architecture: Sharding

To scale, a single index is split into multiple Shards.

  • Primary Shard: Receives every write first, then forwards it to its replicas. Primaries also serve searches.
  • Replica Shard: A copy of a primary that provides high availability and adds read (search) capacity.
  • Routing: When a query arrives, the coordinating node sends the search to one copy (primary or replica) of each relevant shard, merges the per-shard results (Scatter-Gather pattern), and returns them to the user.
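As a rough sketch of the scatter-gather flow, the snippet below routes documents to shards by hashing their ID, runs a local top-k on each shard, and merges the partial results. Several things here are assumptions made for illustration: `zlib.crc32` stands in for the real routing hash, the score is a raw term count rather than a relevance score, the shards are plain dictionaries, and the fan-out is sequential where a real cluster would query shards in parallel.

```python
import heapq
import zlib

NUM_SHARDS = 3


def shard_for(doc_id, num_shards=NUM_SHARDS):
    """Pick a shard from a stable hash of the routing key."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def search_shard(shard_docs, term, k):
    """Local top-k on one shard; score is a placeholder term count."""
    hits = []
    for doc_id, text in shard_docs.items():
        score = text.lower().split().count(term)
        if score > 0:
            hits.append((score, doc_id))
    return heapq.nlargest(k, hits)


def scatter_gather(shards, term, k):
    """Query every shard, then merge the per-shard top-k lists."""
    partial = [search_shard(shard, term, k) for shard in shards]  # scatter
    return heapq.nlargest(k, (hit for hits in partial for hit in hits))  # gather


# Hypothetical sample data, routed to shards by document ID.
docs = {
    "a": "shard shard shard",
    "b": "shard replica",
    "c": "replica only",
    "d": "shard shard",
}
shards = [{} for _ in range(NUM_SHARDS)]
for doc_id, text in docs.items():
    shards[shard_for(doc_id)][doc_id] = text

top_hits = scatter_gather(shards, "shard", k=2)
```

Each shard only needs to return its own top k hits, because any document in the global top k is necessarily in the top k of the shard that holds it.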

5. Ranking and Relevance (TF-IDF vs. BM25)

How does the engine know which document is most relevant?

  • Term Frequency (TF): How often does the word appear in this document?
  • Inverse Document Frequency (IDF): Is this word rare across the whole dataset?
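  • BM25: The default similarity in Elasticsearch since version 5.0. It builds on TF-IDF by adding term-frequency saturation (each extra occurrence of a word contributes less than the previous one) and document-length normalization (a match in a short document counts for more than the same match in a long one).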
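To make the scoring concrete, here is a small sketch of the classic BM25 formula over pre-tokenized documents. The function name and sample documents are hypothetical; `k1=1.2` and `b=0.75` are the commonly used defaults, and the IDF uses the `ln(1 + (N - n + 0.5) / (n + 0.5))` variant, which never goes negative. Production implementations differ in details such as constant factors and how document lengths are stored.

```python
import math


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query with classic BM25."""
    n_docs = len(docs)
    avg_len = sum(len(doc) for doc in docs) / n_docs
    scores = []
    for doc in docs:
        score = 0.0
        for term in query_terms:
            tf = doc.count(term)  # term frequency in this document
            if tf == 0:
                continue
            df = sum(1 for d in docs if term in d)  # documents containing term
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            length_norm = 1 - b + b * len(doc) / avg_len
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores.append(score)
    return scores


# Hypothetical documents of equal length, to isolate the effect of TF.
docs = [
    ["lucene", "index", "shard", "node"],      # term appears once
    ["lucene", "lucene", "lucene", "lucene"],  # term appears four times
    ["shard", "node", "replica", "node"],      # term absent
]
scores = bm25_scores(["lucene"], docs)
```

The second document outranks the first, but its score is well under four times as high even though the term appears four times as often. That flattening is the saturation effect controlled by `k1`.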

6. Real-time Ingestion Pipeline

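Elasticsearch doesn't commit a new Lucene segment to disk for every document; that would be too slow. Instead:

  1. Documents are written to an in-memory Indexing Buffer, where they are not yet searchable.
  2. By default, every 1 second (the refresh interval), the buffer is written out as a new Segment and made searchable.
  3. For durability, each operation is also appended to a Translog (a write-ahead log), which is far cheaper to fsync than a full Lucene commit and can be replayed after a crash.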
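The write path can be modeled with a toy class like the one below. It is a sketch under simplifying assumptions: `ShardWriter` and its methods are hypothetical names, everything lives in memory, and `flush` stands in for a Lucene commit, after which the translog entries are no longer needed for recovery.

```python
class ShardWriter:
    """Toy model of one shard's write path: buffer, refresh, translog."""

    def __init__(self):
        self.buffer = []    # in-memory indexing buffer, not yet searchable
        self.segments = []  # searchable segments (immutable tuples)
        self.translog = []  # append-only log used for crash recovery

    def index(self, doc):
        self.translog.append(doc)  # durability first
        self.buffer.append(doc)

    def refresh(self):
        """Turn the buffer into a new searchable segment."""
        if self.buffer:
            self.segments.append(tuple(self.buffer))
            self.buffer = []

    def flush(self):
        """Model a commit: segments are durable, so the translog is trimmed."""
        self.refresh()
        self.translog = []

    def search(self, word):
        return [
            doc
            for segment in self.segments
            for doc in segment
            if word in doc.split()
        ]


writer = ShardWriter()
writer.index("error in payment service")
visible_before_refresh = writer.search("error")
writer.refresh()
visible_after_refresh = writer.search("error")
translog_after_refresh = list(writer.translog)
writer.flush()
```

The gap between `index` and `refresh` is why search is described as near real-time rather than real-time: a document is durable as soon as it is in the translog, but it only becomes visible at the next refresh.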

Summary

Building a distributed search engine is about mastering the Inverted Index and managing the complex lifecycle of Immutable Segments. By combining these with a robust sharding strategy, you can build a system that powers everything from e-commerce search to petabyte-scale log analysis.



