Design a distributed filesystem with metadata servers and chunk servers handling 10 PB of storage, similar to GFS/HDFS.
## Problem
Design a distributed filesystem that can store 10 PB of data across 1,000 commodity servers and serve large sequential reads at 100 GB/sec aggregate throughput. The system will be used as the storage layer for a data processing platform (similar to MapReduce/Spark), where workloads consist primarily of large files (100 MB to 10 GB) that are written once and read many times.
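The headline numbers imply per-server budgets worth checking before sketching any architecture. A minimal back-of-envelope calculation in Python (the 64 MiB chunk size and 3x replication factor are assumptions borrowed from typical GFS/HDFS defaults, not part of the problem statement):

```python
# Back-of-envelope sizing for the 10 PB / 1,000-server design.
# Chunk size and replication factor are assumed (GFS/HDFS-style defaults).

PB = 10**15          # decimal petabyte, in bytes
GB = 10**9

total_storage = 10 * PB
num_servers = 1_000
aggregate_read = 100 * GB        # bytes/sec, target aggregate throughput
chunk_size = 64 * 2**20          # 64 MiB chunks (assumption)
replication = 3                  # 3x replication (assumption)

storage_per_server = total_storage / num_servers      # raw bytes per server
throughput_per_server = aggregate_read / num_servers  # bytes/sec per server
logical_capacity = total_storage / replication        # user-visible bytes
num_chunks = logical_capacity / chunk_size            # metadata entries the master must track

print(f"storage/server:    {storage_per_server / 10**12:.0f} TB")
print(f"throughput/server: {throughput_per_server / 10**6:.0f} MB/s")
print(f"logical capacity:  {logical_capacity / PB:.2f} PB")
print(f"chunk count:       {num_chunks:,.0f}")
```

The takeaway: 10 TB and 100 MB/s per server are well within commodity hardware, and roughly 50 million chunk records is small enough that a single metadata server can plausibly hold the namespace in memory.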