The library. Every incident, structured.
A growing archive of public postmortems, broken down into a consistent shape: what broke, why it cascaded, and what to take from it. New incidents added regularly.
28+ incidents · 11+ years · 13 organizations
1 result · filtered
topic: re-mirroring
id: FM-013
incident: The EBS Self-Repair Storm That Couldn't Stop Itself
A network change in US-East shifted traffic to a low-capacity path. EBS nodes that lost their replication connection began re-mirroring at the same time, exhausting free capacity and stranding volumes in a loop the cluster could not exit on its own.
org: AWS
date: 2011-04-21
duration: ~4 days
severity: SEV-1
tags: ebs, rds, us-east