Cloudflare outage on July 17, 2020

“This configuration contained an error that caused all traffic across our backbone to be sent to Atlanta. This quickly overwhelmed the Atlanta router and caused Cloudflare network locations connected to the backbone to fail.”
Incident #24 at Cloudflare on 2020/07/17 by John Graham-Cumming (CTO)
Full report https://blog.cloudflare.com/cloudflare-outage-on-july-17-2020/
How it happened Responders mitigating an (unrelated congestion) issue updated the configuration on a router in one location (Atlanta), with the goal of alleviating congestion. That configuration contained a defect that caused all traffic across the global network to be routed through to that location, overwhelming that router and causing failures for all locations on that network.
Architecture A private backbone that carries traffic between different data centers, without going over the public internet.
Technologies
Root cause Configuration error on a router in one location, which inadvertently rerouted all traffic on that network ("backbone") through that router.
Failure Network router was overwhelmed and traffic to other locations was lost.
Impact Outage for company's' services that lasted 27 minutes; logs and metrics were lost at the data centers processing logs.
Mitigation The router with the defective configuration was disabled, shutting down the backbone in that location.