Systematically diagnose high CPU usage on a production Linux server, distinguishing between user, system, and steal time, using top, perf, and flame graphs to identify the offending process and code path.
## Problem
Your monitoring alerts that a production web server is at 95%+ CPU utilization sustained for the last 15 minutes. Request latency has increased 10x and the service is breaching its SLO. Walk through your diagnosis process from initial triage to identifying the specific code path or query consuming CPU.
Sign up to access the full problem
Design canvas, rubric, hints, and model solutions.