Incident Summary
From approximately 16:33 UTC to 19:10 UTC on July 30th, we experienced traffic outages of >90% in the SIN3 datacenter and >95% in the LAX1 datacenter. During this window, traffic in AMS1 was shifted to FRA1 after a 20-30% outage. From approximately 16:33 UTC to 17:15 UTC on the same day, there was a >85% traffic outage in the NYM2 datacenter.
Scope of Impact
During the incident window, ad serving was disrupted across the NYM2, LAX1, AMS1, and SIN3 datacenters.
Timeline (UTC)
2020-07-30 16:33: Incident started and was immediately escalated to the engineering team
2020-07-30 19:10: Ad serving returned to normal
2020-07-31 01:04: Incident resolved: release to correct the issue in Impbus
Cause Analysis
An update to the bidder uncovered an issue that caused Impbus to go down across four datacenters.
Resolution Steps
Our engineers resolved the issue by pushing a temporary fix and restarting impacted instances of Impbus.
Next Steps
Improve detection, monitoring and alerts for changes in Impbus traffic.
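As a rough illustration of what traffic-change detection could look like, the sketch below compares observed traffic against an expected baseline per datacenter and flags any drop at or above a threshold. It is not a description of the actual Impbus monitoring: the `TrafficSample` type, the function names, the requests-per-second figures, and the 20% threshold are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class TrafficSample:
    datacenter: str      # e.g. "SIN3"
    baseline_rps: float  # expected requests/sec (hypothetical baseline)
    observed_rps: float  # requests/sec currently measured


def drop_fraction(sample: TrafficSample) -> float:
    """Fraction of expected traffic that is missing, clamped to [0, 1]."""
    if sample.baseline_rps <= 0:
        return 0.0  # no usable baseline, so report no drop
    missing = 1.0 - sample.observed_rps / sample.baseline_rps
    return max(0.0, min(1.0, missing))


def datacenters_to_alert(samples, threshold=0.2):
    """Return names of datacenters whose traffic drop meets the threshold.

    The 0.2 default is an assumed value, not a documented setting.
    """
    return [s.datacenter for s in samples if drop_fraction(s) >= threshold]
```

For example, a sample with a baseline of 1000 requests/sec and 80 observed corresponds to a drop of about 92% and would be flagged, while a sample at 990 observed (a drop of about 1%) would not.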
The incident has been fully resolved. We apologize for the inconvenience this issue may have caused, and thank you for your continued support.
We have resolved the ad serving issue. We are monitoring an increase in data age, which delays object changes reaching the ad server.
We are currently investigating the following issue:
We will provide an update as soon as more information is available. Thank you for your patience.