Definition

A failure pattern where many clients simultaneously request the same resource — typically after a cache miss or service recovery — causing a sudden spike that overwhelms the backend. Also called a “cache stampede” or “dog-pile effect.”

Discussion

The classic sequence (from scaling-postgresql-openai): a caching layer fails → many requests simultaneously miss on the same keys → all hit the database at once → CPU saturates → query latency rises → client retries → load amplifies → cascade failure. The vicious cycle can bring down an entire service even if the original trigger was minor.
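The amplification step can be made concrete with a toy discrete-time model (not from the source; all numbers are illustrative). It assumes clients time out and retry while the server is still working on the original request, so each failure adds load instead of replacing it:

```python
# Toy model of the vicious cycle: a backend with fixed capacity,
# clients that retry each timed-out request on the next tick while
# the original is still in flight (so failures roughly double load).
CAPACITY = 100          # requests served per tick
NEW_PER_TICK = 90       # steady arrival rate, comfortably below capacity

pending = 300           # initial spike, e.g. after the caching layer fails
for tick in range(5):
    served = min(pending, CAPACITY)
    failed = pending - served
    # each failure is retried next tick on top of its still-pending original
    pending = NEW_PER_TICK + 2 * failed
    print(f"tick {tick}: served {served}, failed {failed}, pending {pending}")
```

Even though steady-state arrivals (90/tick) are below capacity (100/tick), the retry feedback makes the backlog grow without bound after a one-off spike.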

Cache locking (cache leasing) is the primary mitigation: when multiple requests miss on the same key, only one acquires a lock and fetches from the backend. All others wait for the cache to be repopulated. This converts N parallel DB hits into 1, breaking the cascade at its source.
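The lock-per-key idea can be sketched in a few lines. This `LockingCache` class and `slow_fetch` are illustrative, not from the source; a production version would also need lock timeouts and cache expiry:

```python
import threading
import time

class LockingCache:
    """On a miss, only one caller fetches; the rest wait for the cache."""
    def __init__(self, fetch):
        self._fetch = fetch           # expensive backend fetch, e.g. a DB query
        self._cache = {}
        self._locks = {}              # one lock per in-flight key
        self._mutex = threading.Lock()

    def get(self, key):
        with self._mutex:
            if key in self._cache:
                return self._cache[key]
            lock = self._locks.setdefault(key, threading.Lock())
        with lock:                    # only one thread fetches per key
            with self._mutex:
                if key in self._cache:   # repopulated while we waited
                    return self._cache[key]
            value = self._fetch(key)
            with self._mutex:
                self._cache[key] = value
                self._locks.pop(key, None)
            return value

calls = []
def slow_fetch(key):
    calls.append(key)
    time.sleep(0.05)                  # simulate a slow DB query
    return key.upper()

cache = LockingCache(slow_fetch)
threads = [threading.Thread(target=cache.get, args=("user:42",))
           for _ in range(10)]
for t in threads: t.start()
for t in threads: t.join()
print(len(calls))  # 1 — ten concurrent misses, one backend fetch
```

Note that the key lock is never held while waiting on the mutex's critical sections, so the two-level locking cannot deadlock.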

Other mitigations used in practice:

  • Rate limiting at multiple layers (application, proxy, query) to shed load before it reaches the DB.
  • Retry backoff — overly short retry intervals amplify thundering herd; exponential backoff with jitter is standard.
  • Workload isolation — separate instances per tier so a stampede on one workload doesn’t cascade into another.
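The retry-backoff point can be sketched with "full jitter" exponential backoff: the delay is drawn uniformly from [0, min(cap, base · 2^attempt)], so clients that failed at the same instant spread their retries out instead of returning in lockstep. Parameter names here are illustrative:

```python
import random

def backoff_delay(attempt, base=0.1, cap=10.0):
    """Full-jitter exponential backoff: uniform over [0, capped exponential]."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))

# Example: delays for five successive retries of one client.
for attempt in range(5):
    print(f"attempt {attempt}: sleep {backoff_delay(attempt):.3f}s")
```

Plain exponential backoff without jitter still produces synchronized retry waves; the random component is what actually de-correlates the herd.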

The pattern is not exclusive to caching: a retry storm after a timeout, or a surge of expensive queries after a new feature launch, follows the same amplification dynamic.

Key sources

  • scaling-postgresql-openai: Cache locking as the fix; full vicious cycle pattern at ChatGPT scale.
  • uber-intelligent-load-management: Static CoDel timeouts caused thundering herd from simultaneous retries; Cinnamon's PID control as the fix, giving smooth rejection instead of abrupt waves.

Related

PostgreSQL Scaling, Connection Pooling, Load Shedding