Design infrastructure systems that power the modern internet: CDNs, monitoring pipelines, container orchestrators, and more.
4 problems
Design a content delivery network that serves 10 billion requests per day with sub-50ms p99 latency globally.
Design a distributed in-memory cache cluster supporting 1 million operations per second with sub-millisecond reads across 10TB of data.
Design a distributed session store handling 500M concurrent sessions with sub-5ms read latency and 99.99% availability.
Design a caching layer for a web application with a PostgreSQL backend, targeting <100ms p95 response times and >80% cache hit ratio within a 2GB memory budget.
4 problems
Design a distributed tracing system that propagates context across 5,000+ microservices, ingests 500K spans/sec, and supports trace lookup in under 2 seconds.
Design a health checking system that monitors 50 microservices across 200 instances with configurable check intervals and raises an alert within 60 seconds of a failure.
Design a centralized log aggregation system that ingests 2 TB/day from 50,000+ hosts with sub-5-second search latency across 30 days of retained data.
Design a monitoring pipeline that ingests metrics from 10,000+ microservices at 5 million data points per second with sub-60s alert latency.
5 problems
Design the DNS system for a global consumer application: authoritative zones, geo-steering, health-checked records, anycast recursive resolvers for service discovery, and a control plane that propagates record changes worldwide in under 60 seconds.
Design an L4/L7 load balancer that handles 5 million concurrent connections with sub-millisecond routing overhead and 3-second failover.
Design an NGINX-like reverse proxy that routes requests across 10 backend servers at 5K req/sec with sub-2ms overhead.
Design a service mesh handling mTLS, traffic splitting, and observability across 2,000+ microservices and 50K pods with sub-millisecond proxy overhead.
Design an API gateway handling 100K requests/sec across 500+ backend services with <5ms p99 latency overhead, supporting routing, auth, rate limiting, and request transformation.
4 problems
Design a chaos engineering platform that orchestrates 1,000+ simultaneous fault injections across a distributed fleet with sub-100ms injection latency and sub-1s kill switch response.
Design a circuit breaker library for a platform with 800+ microservices and 50K circuit instances that detects failures within a 10-second rolling window and changes state in under 1ms.
Design a distributed rate limiter that enforces 10,000 requests/second per API key across 50+ globally distributed nodes with sub-1ms decision latency.
Design retry and timeout policies for a microservice calling 5 downstream services, achieving a 99.9% success rate while keeping retry overhead under 10% extra load.
4 problems
Design a blue-green deployment system for a web application running on 20 servers with sub-30-second cutover, instant rollback, and zero downtime.
Design a CI/CD pipeline that supports 500+ microservices deploying 1,000+ times per day with automated canary analysis and sub-5-minute rollback.
Design a container scheduler that bin-packs 500K pods across 10K nodes with <1s scheduling latency and zero dropped workloads during node failures.
Design a feature flag platform serving 1,000+ microservices evaluating 10B flag checks/day with sub-millisecond local evaluation and sub-5-second global propagation.
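
Several of the targets above are stated per day rather than per second, which makes them hard to compare with the per-second figures in other problems. The sketch below converts them to average rates. It assumes traffic is spread evenly over the day (real workloads peak well above the average) and that TB means decimal terabytes; the helper name `per_second` is illustrative and not part of any problem statement.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(per_day: float) -> float:
    """Average per-second rate for a per-day total, assuming uniform load."""
    return per_day / SECONDS_PER_DAY


# Per-day figures taken from the problem statements above.
cdn_rps = per_second(10e9)          # CDN: 10 billion requests/day
flag_checks_ps = per_second(10e9)   # feature flags: 10B flag checks/day
log_bytes_ps = per_second(2e12)     # log aggregation: 2 TB/day (decimal TB assumed)
deploys_ps = per_second(1_000)      # CI/CD: 1,000 deployments/day

print(f"CDN requests:     {cdn_rps:,.0f} req/s on average")
print(f"Flag evaluations: {flag_checks_ps:,.0f} checks/s on average")
print(f"Log ingest:       {log_bytes_ps / 1e6:.1f} MB/s on average")
print(f"Deployments:      one every {1 / deploys_ps:.1f} s on average")
```

On these assumptions the CDN target works out to roughly 116K requests per second on average, and 2 TB/day of logs to roughly 23 MB/s. Peak load would need a separate multiplier that the problem statements do not specify.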