Incident #4 at Cloudflare on 2017/02/13 by John Graham-Cumming (Chief Technology Officer)
|
Full report
|
https://blog.cloudflare.com/incident-report-on-memory-leak-caused-by-cloudflare-parser-bug/
|
How it happened
|
A new module added to the webserver subtly changed its buffering. This triggered a latent defect in a second module, leading to a buffer overrun and corrupted webpages.
|
Architecture
|
NGINX webserver with multiple custom modules, including a Ragel-based HTML parser used to modify HTML.
|
Technologies
|
NGINX, Ragel
|
Root cause
|
A latent defect in how a custom webserver module checked for the end of a buffer.
|
Failure
|
Buffer overrun while preparing responses to some HTTP requests.
|
Impact
|
Webserver returned corrupted webpages that contained private information (cookies, authentication tokens, etc.).
|
Mitigation
|
Deployed a fix for the end-of-buffer check. Asked external organizations to clear corrupted pages from their caches.
|
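Illustration of the root cause

The linked report attributes the overrun to an end-of-buffer test that used equality, so a pointer that stepped past the end was never caught. The sketch below is a simplified Python simulation of that class of defect, not Cloudflare's code: the `scan` function, the two-byte step on `<`, and the `MEMORY` layout are all hypothetical and chosen only to show why a `>=` test is safer than `==`.

```python
def scan(memory: bytes, pe: int, use_equality_check: bool) -> bytes:
    """Hypothetical scanner: emits bytes of memory[0:pe].

    `pe` marks the end of the buffer the scanner is allowed to read.
    Bytes at index >= pe stand in for unrelated adjacent memory.
    """
    out = bytearray()
    p = 0
    # len(memory) stands in for the edge of readable process memory;
    # real C code would have no such safety net.
    while p < len(memory):
        # The defect: an equality test only fires if p lands exactly on pe.
        at_end = (p == pe) if use_equality_check else (p >= pe)
        if at_end:
            break
        out.append(memory[p])
        # Assumed multi-byte step: '<' consumes itself and the next byte,
        # which can carry p past pe without ever equalling it.
        p += 2 if memory[p] == ord("<") else 1
    return bytes(out)


# The buffer is the first 4 bytes; the rest simulates another request's data.
MEMORY = b"abc<" + b"SECRET-COOKIE"
PE = 4

leaked = scan(MEMORY, PE, use_equality_check=True)
safe = scan(MEMORY, PE, use_equality_check=False)
print(leaked)
print(safe)
```

In this simulation the buffer ends in an unterminated `<`, so the two-byte step moves the position from 3 to 5 and skips over `pe`. With the equality test the scanner keeps reading into the adjacent bytes; with the `>=` test it stops at the buffer boundary.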