Impression Bus Latency for SSP Traffic
Incident Report for Xandr
Postmortem

Incident Summary
From approximately 15:21 UTC to 18:36 UTC on Friday, March 20, 2020, high CPU utilization on servers in our LAX1 datacenter caused latency and timeouts for some external supply partners.

Scope of Impact

Some external supply partners experienced latency of up to one second and a timeout rate of approximately 15%.

Timeline (UTC)

2020-03-20 15:21: Incident Started: Latency issue was reported

2020-03-20 15:51: Incident Identified: A capacity issue in LAX1 was found to be the root cause of the latency, and we began adding capacity

2020-03-20 18:36: Incident Resolved: Four new load balancers were added to the LAX1 datacenter to increase capacity, resolving the latency issue

Cause Analysis
The root cause was found to be a capacity issue with the servers in our LAX1 datacenter. The available hardware was running at full capacity.

Resolution Steps
This issue was resolved by adding four more servers in the LAX datacenter to the necessary collections.

Next Steps

  • Make additional servers available to mitigate the potential impact of traffic increases
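
As an illustration only, the headroom check behind this step can be sketched as below. The function names, the traffic figures, the per-server throughput, and the 70% utilization target are all hypothetical and are not taken from the incident. The sketch simply shows how a fleet running at full capacity compares with one sized to leave headroom for traffic increases.

```python
import math


def utilization(peak_rps: float, per_server_rps: float, servers: int) -> float:
    """Fraction of total fleet capacity consumed at peak traffic."""
    if servers <= 0 or per_server_rps <= 0:
        raise ValueError("servers and per_server_rps must be positive")
    return peak_rps / (per_server_rps * servers)


def servers_needed(peak_rps: float, per_server_rps: float,
                   target_utilization: float) -> int:
    """Smallest fleet size that keeps peak utilization at or below the target."""
    if per_server_rps <= 0:
        raise ValueError("per_server_rps must be positive")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    return math.ceil(peak_rps / (per_server_rps * target_utilization))


# Hypothetical numbers, for illustration only.
PEAK_RPS = 100_000        # assumed peak requests per second
PER_SERVER_RPS = 5_000    # assumed throughput of one server
CURRENT_SERVERS = 20      # assumed current fleet size
TARGET = 0.7              # assumed utilization ceiling

current = utilization(PEAK_RPS, PER_SERVER_RPS, CURRENT_SERVERS)
required = servers_needed(PEAK_RPS, PER_SERVER_RPS, TARGET)
print(f"current utilization: {current:.0%}")
print(f"servers to add: {max(0, required - CURRENT_SERVERS)}")
```

With these assumed inputs the existing fleet is fully utilized at peak, so any traffic increase translates directly into latency and timeouts. Sizing to a utilization target below 100% is what leaves room for such increases.
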
Posted Mar 27, 2020 - 19:31 UTC

Resolved

The incident has been fully resolved. We apologize for the inconvenience this issue may have caused, and thank you for your continued support.

Posted Mar 20, 2020 - 20:19 UTC
Identified

We have identified the following issue:

  • Component(s): Ad Serving
  • Impact(s):
    • Drop in delivery on external supply
  • Severity: Minor Outage
  • Datacenter(s): LAX1

Our engineers are actively working towards a resolution, and we will provide an update as soon as possible. Thank you for your patience.

Posted Mar 20, 2020 - 17:24 UTC