Diagnose and resolve thundering herd problems across multiple scenarios: cache stampede after key expiry, connection storm after service restart, and lock convoy on hot rows, using request coalescing, jittered expiry, and probabilistic early recomputation.
## Problem
Your service uses a Redis cache in front of a PostgreSQL database. When a popular cache key expires, hundreds of concurrent requests miss the cache and hit the database at the same time, overwhelming it and causing cascading latency for all users. This recurs every 60 seconds, each time the hot key's TTL expires. In addition, after a service restart, all instances reconnect to the database simultaneously, causing a connection storm. Diagnose these thundering herd variants and design solutions.
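As a starting point for thinking about the problem, the techniques named above can be sketched in a few lines of Python. This is a minimal, in-process illustration under stated assumptions, not a reference solution: every name here (`jittered_ttl`, `should_recompute_early`, `SingleFlight`, `reconnect_delay`) is hypothetical, the early-recomputation check uses the commonly cited form `now - delta * beta * ln(rand) >= expiry`, and full-jitter backoff for reconnects is one possible answer to the connection storm rather than something the problem statement prescribes. No Redis or PostgreSQL client is involved.

```python
import math
import random
import threading
import time
from typing import Any, Callable, Dict, Optional


def jittered_ttl(base_ttl: float, spread: float = 0.1) -> float:
    """Scale base_ttl by a random factor in [1 - spread, 1 + spread] so that
    keys written at the same moment do not all expire at the same moment."""
    return base_ttl * random.uniform(1.0 - spread, 1.0 + spread)


def should_recompute_early(expiry: float, delta: float, beta: float = 1.0,
                           now: Optional[float] = None) -> bool:
    """Probabilistic early recomputation (assumed form).

    expiry: absolute expiry timestamp of the cached value, in seconds.
    delta:  how long the last recomputation took, in seconds.
    beta:   values above 1 favor earlier refreshes.
    """
    now = time.time() if now is None else now
    # 1 - random() lies in (0, 1], so the log is <= 0 and the subtraction
    # pushes "now" forward by a random amount proportional to delta.
    return now - delta * beta * math.log(1.0 - random.random()) >= expiry


class SingleFlight:
    """Request coalescing: concurrent callers for the same key share one call."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._calls: Dict[str, dict] = {}  # key -> in-flight call record

    def do(self, key: str, fn: Callable[[], Any]) -> Any:
        with self._lock:
            call = self._calls.get(key)
            leader = call is None
            if leader:
                call = {"done": threading.Event(), "value": None, "error": None}
                self._calls[key] = call
        if leader:
            try:
                call["value"] = fn()  # only the leader touches the database
            except Exception as exc:
                call["error"] = exc
            finally:
                with self._lock:
                    del self._calls[key]
                call["done"].set()
        else:
            call["done"].wait()  # followers wait for the leader's result
        if call["error"] is not None:
            raise call["error"]
        return call["value"]


def reconnect_delay(attempt: int, base: float = 0.1, cap: float = 30.0) -> float:
    """Full-jitter exponential backoff: uniform in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0.0, min(cap, base * 2 ** attempt))
```

In this sketch, a cache read path would call `should_recompute_early` on every hit and, on a miss or an early-refresh decision, wrap the database query in `SingleFlight.do(key, ...)` before writing the value back with `jittered_ttl(60)`. Note that `SingleFlight` only coalesces requests within one process; coalescing across service instances would need a shared mechanism such as a short-lived lock key in the cache.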
Senior · Troubleshooting