Diagnose and resolve a memory leak in a production service where RSS grows linearly until the OOM killer terminates the process, using heap profiling, cgroup memory accounting, and systematic elimination.
## Problem
Your team's Go API service is experiencing a steady memory leak. RSS grows by approximately 200 MB per hour from a baseline of 800 MB after restart. When RSS reaches the 4 GB cgroup limit, the OOM killer terminates the process, causing a 30-second restart gap during which requests are dropped. This happens roughly every 16 hours. Describe your approach to diagnosing and fixing this leak.
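As a sanity check on the figures above, the time between restarts can be estimated from the baseline, the growth rate, and the limit. The sketch below is illustrative only: `hoursUntilLimit` is a hypothetical helper, it assumes growth is strictly linear, and it takes the 4 GB limit to mean 4096 MB.

```go
package main

import "fmt"

// hoursUntilLimit estimates how long a process takes to grow from baselineMB
// to limitMB at a constant growthMBPerHour. It returns 0 if the process is
// already at or above the limit, and -1 if the growth rate is not positive.
// Hypothetical helper for illustration; not part of any library.
func hoursUntilLimit(baselineMB, growthMBPerHour, limitMB float64) float64 {
	if growthMBPerHour <= 0 {
		return -1
	}
	if baselineMB >= limitMB {
		return 0
	}
	return (limitMB - baselineMB) / growthMBPerHour
}

func main() {
	// Figures from the problem statement; 4 GB is assumed to be 4096 MB.
	h := hoursUntilLimit(800, 200, 4096)
	fmt.Printf("estimated time to OOM: %.1f hours\n", h)
}
```

Under these assumptions the estimate comes out to about 16.5 hours, which agrees with the observed restart interval of roughly 16 hours and suggests the growth rate is close to constant rather than tied to bursts of traffic.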