Cookie Data Update Error
Incident Report for Xandr
Postmortem

Incident Summary
From approximately 16:40 UTC to 17:25 UTC on Wednesday, January 29th, a cookie data store release implementing a new format for frequency data caused the loss of frequency counts on some active cookies.

Scope of Impact
During the incident window, 25-50% of cookied users who saw impressions lost their associated frequency counts. As a result, some clients may observe frequency cap violations for affected cookied users for up to one month following this incident.

Timeline (UTC)
2020-01-29 16:40 UTC: Incident Started: Release made to cookie data store that overwrites frequency counts
2020-01-29 17:25 UTC: Release completed.
2020-01-29 19:40 UTC: Post-release monitoring detects cookie frequency anomalies
2020-01-29 20:40 UTC: Incident escalated.
2020-01-30 15:00 UTC: Incident Resolved: All datacenters confirmed back to normal

Cause Analysis
The root cause of the incident was a cookie data store release meant to update the cookie frequency data format. Cookie data store instances that were not yet on the new release registered cookies in the old format without frequency data, overwriting the new-format data and effectively resetting frequency counts for these users.
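The overwrite described above can be illustrated with a minimal sketch. It is not the actual cookie data store code: the record layout, the field names (`freq`, `freq_v2`), and the helper functions are all hypothetical, chosen only to show how an instance still on the old release could reset counts written in the new format.

```python
# Hypothetical model of a shared cookie store during a mixed-version rollout.
# Record layout and all names are illustrative assumptions, not the real schema.

def write_new(store, cookie_id, freq):
    """Upgraded instance: stores counts only under the new-format field."""
    store[cookie_id] = {"version": 2, "freq_v2": dict(freq)}

def write_old(store, cookie_id):
    """Non-upgraded instance: does not know about freq_v2, so it rewrites the
    record in the old format and carries over only the old-format counts."""
    existing = store.get(cookie_id, {})
    store[cookie_id] = {"version": 1, "freq": dict(existing.get("freq", {}))}

def read_count(store, cookie_id, campaign):
    """Return the frequency count for a campaign, whichever format holds it."""
    record = store.get(cookie_id, {})
    counts = record.get("freq_v2") or record.get("freq") or {}
    return counts.get(campaign, 0)

store = {}
write_new(store, "cookie-a", {"campaign-1": 3})
before = read_count(store, "cookie-a", "campaign-1")  # count as written by the new release
write_old(store, "cookie-a")                          # old-release instance touches the cookie
after = read_count(store, "cookie-a", "campaign-1")   # new-format counts are gone
```

In this sketch the count is present after the new-format write and reads as zero once the old-release instance rewrites the record, which is the kind of reset that lets a frequency cap be exceeded.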

Resolution Steps
Following the release, engineers stopped using the new frequency count format to overwrite frequency counts.

Next Steps

  • Ensure old and new data versions are not mixed during releases
  • Re-release the new format in a way that retains the old format
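
One way to read the second item is a dual-write transition, sketched below under stated assumptions. The field names (`freq`, `freq_v2`) and functions are hypothetical and do not describe the planned re-release; the sketch only shows why keeping the old-format field populated would let counts survive a write from an instance still on the old release.

```python
# Hypothetical dual-write sketch: the new release writes both formats so that
# instances still on the old release keep seeing (and preserving) the counts.
# All names and the record layout are illustrative assumptions.

def write_dual(store, cookie_id, freq):
    """Upgraded instance: writes the new format and retains the old one."""
    store[cookie_id] = {"version": 2, "freq": dict(freq), "freq_v2": dict(freq)}

def write_old(store, cookie_id):
    """Non-upgraded instance: rewrites the record using only old-format fields."""
    existing = store.get(cookie_id, {})
    store[cookie_id] = {"version": 1, "freq": dict(existing.get("freq", {}))}

def read_count(store, cookie_id, campaign):
    """Return the frequency count for a campaign, whichever format holds it."""
    record = store.get(cookie_id, {})
    counts = record.get("freq_v2") or record.get("freq") or {}
    return counts.get(campaign, 0)

store = {}
write_dual(store, "cookie-a", {"campaign-1": 3})
write_old(store, "cookie-a")  # old-release instance touches the cookie mid-rollout
retained = read_count(store, "cookie-a", "campaign-1")
```

Here the old-release write drops the new-format field, but the count is still readable from the retained old-format field.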
Posted Feb 06, 2020 - 22:30 UTC

Resolved

The incident has been fully resolved. We apologize for the inconvenience this issue may have caused, and thank you for your continued support.

Posted Feb 03, 2020 - 18:27 UTC
Monitoring

We have patched the issue and are monitoring our systems closely. We will provide an update as soon as the issue has been fully resolved.

Posted Jan 30, 2020 - 00:42 UTC
Identified

We have identified the following issue:

  • Component(s): Userdata
  • Impact(s):
    • Some frequency caps may be intermittently violated
  • Severity: Minor Outage
  • Datacenter(s): Global

Our engineers are actively working towards a resolution, and we will provide an update as soon as possible. Thank you for your patience.

Posted Jan 29, 2020 - 20:53 UTC