The library. Every incident, structured.
A growing archive of public postmortems, broken down into a consistent shape: what broke, why it cascaded, and what to take from it. New incidents added regularly.
28+ incidents · 11+ years · 13 organizations
2 results · filtered
topic: control-plane
id
incident
org
date
duration
severity
tags
FM-008
Cloudflare's Edge Stayed Up. Its Control Plane Went Dark
A cascading power failure took out Cloudflare's primary control plane facility. The high-availability cluster did not survive the loss of one of its three sites, and the dashboard, API, and analytics went down while the data plane kept serving customer traffic.
Cloudflare
2023-11-02
~36h
SEV-1
datacenter · power · control-plane
FM-014
The Automation Bug That Took Google's Network Control Plane Offline
A bug in Google's datacenter maintenance automation descheduled the network control plane in multiple physical locations at once. BGP withdrew within minutes, and traffic flowed onto an oversubscribed fail-static path until engineers could rebuild the configuration.
Google Cloud
2019-06-02
4h 25m
SEV-1
networking · automation · bgp