Incident Summary
From approximately 10:00 UTC on Monday, March 16 to 02:00 UTC on Wednesday, March 18, an increase in impression volume slowed reporting jobs, delaying some lower-priority reports by up to 24 hours. Impression volume has continued to grow since then, but we are now able to keep up.
Scope of Impact
During the incident window, Network Analytics reports were delayed by more than six hours, Yieldex Analytics and log-level data reporting experienced delays, and some lower-priority reporting jobs were delayed by as much as one day.
Timeline (UTC)
2020-03-16 10:00: Incident Started: Impression volume exceeded our prior record peak volume by approximately 20%
2020-03-16 17:00: Transacted impressions began taking >1 hour to process
2020-03-17 16:00: Incident Reported
2020-03-18 02:00: Incident Resolved: 28 additional servers added
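The intervals between the timeline entries above can be worked out with a short script. This is only a sketch: the event keys and the helper name `hours_between` are hypothetical labels chosen for illustration, and the timestamps are copied directly from the timeline.

```python
from datetime import datetime, timezone

# Timestamps copied from the timeline above (all UTC).
# The dictionary keys are hypothetical labels, not official event names.
TIMELINE = {
    "started": datetime(2020, 3, 16, 10, 0, tzinfo=timezone.utc),
    "processing_over_1h": datetime(2020, 3, 16, 17, 0, tzinfo=timezone.utc),
    "reported": datetime(2020, 3, 17, 16, 0, tzinfo=timezone.utc),
    "resolved": datetime(2020, 3, 18, 2, 0, tzinfo=timezone.utc),
}


def hours_between(start_key, end_key, timeline=TIMELINE):
    """Return the elapsed hours between two named timeline events."""
    delta = timeline[end_key] - timeline[start_key]
    return delta.total_seconds() / 3600


if __name__ == "__main__":
    # Print each interval of interest in hours.
    for start, end in [
        ("started", "processing_over_1h"),
        ("started", "reported"),
        ("reported", "resolved"),
        ("started", "resolved"),
    ]:
        print(f"{start} -> {end}: {hours_between(start, end):.0f} h")
```

Under these timestamps, the incident lasted 40 hours in total, with 30 hours between the start and the report and 10 hours between the report and the resolution.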
Cause Analysis
The incident was caused by a dramatic increase in traffic, which placed heavy load on our New York servers. The load slowed reporting jobs, including the processing of transacted impressions, so reports took longer to become available.
Resolution Steps
Our engineering team resolved this issue by adding 28 servers to handle the increased traffic.
Next Steps
The incident has been fully resolved. We apologize for the inconvenience this issue may have caused, and thank you for your continued support.