Reporting Data Delayed
Incident Report for Xandr
Postmortem

Incident Summary

From approximately 21:11 UTC on Wednesday, January 1st to 02:30 UTC on Thursday, January 2nd, reporting data was stale by more than 6 hours.

Scope of Impact

During the incident window, clients were unable to pull reporting data for the hours between 18:00 UTC and 21:00 UTC on January 1st, 2020.

Timeline (UTC)

2020-01-01 19:06 UTC: Incident Started. Data streaming application failure prevented data logs for certain hours from closing.

2020-01-01 19:00 UTC: Data logs for affected hours manually closed.

2020-01-02 02:00 UTC: Incident resolved: reporting data caught up.

Cause Analysis

The root cause of the incident was a hardware failure that affected the server on which the data streaming application relies.

Resolution Steps

The data logs were closed manually, allowing data to flow into reporting.

Next Steps

  • Revisit the process for handling unclosed data logs.
  • Improve the alert and escalation process for missing logs.
Posted Jan 16, 2020 - 01:30 UTC

Resolved

The incident has been fully resolved. We apologize for the inconvenience this issue may have caused, and thank you for your continued support.

Posted Jan 02, 2020 - 02:24 UTC
Identified

We have identified the following issue:

  • Component(s): Reporting
  • Impact(s):
    • Stale reporting data in excess of 6 hours.
  • Severity: Major Outage
  • Datacenter(s): Global

Our engineers are actively working towards a resolution, and we will provide an update as soon as possible. Thank you for your patience.

Posted Jan 02, 2020 - 00:05 UTC