Reporting Data Delayed

Incident Report for Xandr

Postmortem

Incident Summary

From approximately 17:15 to 19:30 UTC on Saturday, December 28th, 2019, reporting data was more than 6 hours stale.

Scope of Impact

During the incident window, clients were unable to pull reporting data for the hours between 10:00 UTC and 13:00 UTC.

Timeline (UTC)

2019-12-28 11:13 UTC: Incident Started. Data streaming application failure prevented data logs for certain hours from closing.

2019-12-28 14:38 UTC: Data logs for affected hours manually closed.

2019-12-28 19:30 UTC: Incident Resolved. Reporting data caught up.

Cause Analysis

The root cause of the incident was a hardware failure on the server on which the data streaming application relies.

Resolution Steps

The data logs were closed manually, allowing data to flow into reporting.

Next Steps

  • Revisit the process for handling unclosed data logs.
  • Improve the alert and escalation process for missing logs.

Posted Jan 16, 2020 - 01:25 UTC

Resolved

The incident has been fully resolved. We apologize for the inconvenience this issue may have caused, and thank you for your continued support.

Posted Dec 28, 2019 - 23:33 UTC

Identified

We have identified the following issue:

  • Component(s): Reporting
  • Impact(s):
    • Stale reporting data in excess of 6 hours starting at 16:00 UTC.
  • Severity: Major Outage
  • Datacenter(s): Global

Our engineers are actively working towards a resolution, and we will provide an update as soon as possible. Thank you for your patience.

Posted Dec 28, 2019 - 20:01 UTC