System Design: Designing a Distributed File System (HDFS)

A Distributed File System (like HDFS or GFS) is designed to store massive datasets across a cluster of commodity servers. It handles the complexity of breaking large files into chunks, replicating those chunks, and managing hardware failures without interrupting access.

1. Core Requirements

Massive Storage: Storing petabytes of data.
Reliability: Data must survive disk/node failures.
Sequential Access: Optimized for large, sequential file reads.
Throughput: High bandwidth for batch processing (MapReduce/Spark).

2. High-Level Architecture

NameNode (Metadata Server): Keeps track of file structure (namespaces), which chunks make up each file, and where those chunks reside. It's the "brain."
DataNodes (Storage Servers): The physical machines where the actual file chunks are stored. They handle read/write requests from clients.

3. Data Chunking & Replication

Chunking: Files are broken into fixed-size chunks (e.g., 128MB). This size is intentionally large to reduce metadata overhead and optimize sequential reads.
Replication: Each chunk is replicated (usually 3 times). To ensure fault tolerance, replicas are placed in different racks/AZs so that a single rack failure doesn't lose the entire chunk.

4. The NameNode Bottleneck

The NameNode stores metadata in memory to keep lookups fast. But what if we have billions of files?

Federation: Split the file system into multiple namespaces, each managed by its own NameNode.
Caching: Frequently accessed metadata is cached, but the NameNode remains the absolute source of truth.

5. Heartbeats and Failure Recovery

Heartbeats: DataNodes send heartbeats to the NameNode every 3-5 seconds.
Recovery: If the NameNode stops receiving heartbeats from a DataNode, it marks all chunks on that node as "under-replicated" and commands other healthy nodes to create new replicas immediately.

Summary

Building a distributed file system is about Abstraction. By separating metadata management (NameNode) from raw storage (DataNodes) and using massive block sizes, you can build a storage engine that scales to thousands of nodes and petabytes of data while appearing as a single, simple directory structure to the user.

Sachin Sarawgi

Engineering Manager and backend engineer with 10+ years building distributed systems across fintech, enterprise SaaS, and startups. CodeSprintPro is where I write practical guides on system design, Java, Kafka, databases, AI infrastructure, and production reliability.

System Design: Designing a Distributed File System (HDFS/GCS Style)

System Design: Designing a Distributed File System (HDFS)

1. Core Requirements

2. High-Level Architecture

3. Data Chunking & Replication

4. The NameNode Bottleneck

5. Heartbeats and Failure Recovery

Summary

Sachin Sarawgi

Keep Learning

System Design: Designing a Distributed ID Generator (Snowflake)

System Design: Designing a Distributed File Lock (Zookeeper/Curator)

Related Articles

System Design: Designing Google Drive (Distributed File Storage)

System Design: Designing an Object Store (Amazon S3 Internals)

System Design: Designing an Ad Click Aggregator

System Design: Designing Airbnb (Hotel/Home Booking)

More in System Design

System Design: Designing Stateless Authentication

gRPC vs REST: The Decision-Maker's Guide for Backend Architecture

gRPC vs REST: A Decision-Maker's Guide for Backend Architecture

System Design: Designing a Distributed File System (HDFS/GCS Style)

System Design: Designing a Distributed File System (HDFS)

1. Core Requirements

2. High-Level Architecture

3. Data Chunking & Replication

4. The NameNode Bottleneck

5. Heartbeats and Failure Recovery

Summary

Get the next backend guide in your inbox

Sachin Sarawgi

Keep Learning

System Design: Designing a Distributed ID Generator (Snowflake)

System Design: Designing a Distributed File Lock (Zookeeper/Curator)

Related Articles

System Design: Designing Google Drive (Distributed File Storage)

System Design: Designing an Object Store (Amazon S3 Internals)

System Design: Designing an Ad Click Aggregator

System Design: Designing Airbnb (Hotel/Home Booking)

More in System Design

System Design: Designing Stateless Authentication

gRPC vs REST: The Decision-Maker's Guide for Backend Architecture

gRPC vs REST: A Decision-Maker's Guide for Backend Architecture