A deep dive into Linux process management, memory, filesystems, the networking stack, and kernel internals at interview depth.
4 problems
Design a container runtime using Linux namespaces, cgroups v2, pivot_root, and seccomp — implementing the OCI runtime spec to run isolated workloads on a single host.
Design a process supervisor that manages 50+ services with health checks, restart policies, and graceful shutdown — similar to systemd or supervisord.
Explain the Linux CFS scheduler, nice values, real-time scheduling (SCHED_FIFO/RR), CPU affinity, NUMA awareness, and cgroup CPU bandwidth control.
Explain how Linux processes are created, scheduled, and terminated — covering fork/exec, process states (R/S/D/Z/T), zombie cleanup, and the /proc filesystem.
4 problems
Design a data processing pipeline handling 500 GB datasets on 64 GB RAM machines using mmap, huge pages, NUMA-aware allocation, and zero-copy I/O.
Diagnose and resolve Linux OOM killer events across a production fleet: oom_score_adj tuning, cgroup memory limits, overcommit policies, and memory leak detection.
Explain how Linux manages memory: virtual vs physical addressing, page tables, TLB, malloc/mmap, and key /proc/meminfo metrics.
Explain Linux page cache, swap mechanics, and tuning: swappiness, dirty page writeback, vm.dirty_ratio, readahead, and drop_caches behavior.
4 problems
Design a distributed filesystem with metadata servers and chunk servers handling 10 PB of storage, similar to GFS/HDFS.
Design an LSM-tree storage engine handling 500K writes/sec with compaction, a WAL, and Bloom filters, similar to LevelDB/RocksDB.
Diagnose and resolve I/O bottlenecks using iostat, iotop, and blktrace. Cover I/O schedulers, direct vs buffered I/O, and RAID trade-offs.
Walk through the Linux VFS layer, inodes, dentries, and ext4 journaling. Explain how df, du, and lsof interact with the kernel.
4 problems
Design a Layer 7 network proxy handling 500K concurrent connections with sub-millisecond added latency using epoll, connection pooling, and zero-copy I/O.
Systematically diagnose network performance problems including TCP retransmissions, conntrack table overflow, MTU/MSS mismatches, and SYN floods using ss, tcpdump, and kernel counters.
Explain how eBPF transforms Linux networking with XDP for line-rate packet processing, tc BPF for traffic control, socket BPF for observability, and Cilium's eBPF-based service mesh.
Trace a packet's path through the Linux kernel from NIC to userspace socket, covering drivers, netfilter, iptables chains, network namespaces, and veth pairs.
3 problems
Design a fleet-wide kernel live patching system that applies security fixes to 10,000 servers without reboots, meeting a patch SLA of under 4 hours with zero downtime.
Design a methodical performance investigation workflow using the USE method, perf, ftrace, and flame graphs to diagnose latency regressions on a 50-host service tier.
Explain the most critical sysctl parameters for production Linux servers, covering vm.swappiness, net.core.somaxconn, fs.file-max, and TCP tuning for a 10,000 RPS web tier.