Design a health-checking system that monitors 50 microservices across 200 instances, with configurable check intervals and under 60 seconds from failure to alert.
## Problem
Design a health check system for an organization running 50 microservices across 200 instances. The system must determine whether each instance is alive, ready to serve traffic, and connected to its dependencies, then surface this information through a dashboard and alert on-call engineers when services degrade.