Incident Summary
From approximately 16:15 to 17:27 UTC on Tuesday, April 27, 2021, Xandr experienced a reduction in ad serving capacity and an increase in bidder timeouts in the NYM2 datacenter.
Scope of Impact
During the incident, ad serving out of the NYM2 datacenter ran at reduced capacity and bidder timeout rates were between 40% and 50%, which resulted in reduced ad serving for some clients.
Timeline (UTC)
2021-04-27 16:15: Incident Started: Bidders started to crash in NYM2
2021-04-27 16:35: Ad serving capacity reduced to 17% in NYM2
2021-04-27 16:52: Incident ticket created
2021-04-27 17:27: Incident Resolved: NYM2 was back to 100% capacity
Cause Analysis
The root cause of the incident was a code defect triggered by a single user id. Whenever that user id appeared in a bid request, the bidder processing the request crashed.
Resolution Steps
The offending user id was banned, which allowed the bidder instances to stabilize, and ad serving capacity returned to 100%.
Next Step(s)
• Make the bidder code more resilient so that a single user id cannot cause a widespread crash.
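As an illustration of the kind of resilience described in the next step above, the following is a minimal sketch only and does not represent Xandr's bidder code. It assumes the failure surfaces as a catchable exception (a native crash would need process-level isolation instead), and every name in it (`QuarantiningHandler`, `fragile_handler`, `max_failures`, the `user_id` field) is hypothetical. The idea is to contain a failure to the single request that caused it and to stop processing a user id once it has failed repeatedly.

```python
from collections import Counter
from typing import Callable, Optional


class QuarantiningHandler:
    """Hypothetical wrapper: isolates per-request failures and
    quarantines user ids that fail repeatedly."""

    def __init__(self, handler: Callable[[dict], dict], max_failures: int = 3):
        self.handler = handler            # hypothetical per-request bid logic
        self.max_failures = max_failures  # failures tolerated before quarantine
        self.failures = Counter()         # failure count per user id
        self.quarantined = set()          # user ids that are skipped outright

    def handle(self, request: dict) -> Optional[dict]:
        user_id = request.get("user_id")
        if user_id in self.quarantined:
            return None  # respond with no bid instead of risking another failure
        try:
            return self.handler(request)
        except Exception:
            # Contain the failure to this one request and record it.
            self.failures[user_id] += 1
            if self.failures[user_id] >= self.max_failures:
                self.quarantined.add(user_id)
            return None


def fragile_handler(request: dict) -> dict:
    """Stand-in for bid logic that fails on one specific (made-up) user id."""
    if request["user_id"] == "bad-user":
        raise ValueError("malformed user data")
    return {"bid": 1.0}


guard = QuarantiningHandler(fragile_handler, max_failures=2)
user_ids = ["ok", "bad-user", "bad-user", "bad-user", "ok"]
results = [guard.handle({"user_id": u}) for u in user_ids]
```

In this sketch, requests for other user ids continue to receive bids while the failing user id is skipped automatically after two failures, which approximates in code the manual ban used to resolve this incident.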
The incident has been fully resolved. We apologize for the inconvenience this issue may have caused, and thank you for your continued support.