Delayed delivery of reports and log level data for 2020-05-27 03:00 UTC
Incident Report for Xandr
Postmortem

Incident Summary

At approximately 03:00 UTC on Wednesday, May 27th, 2020, a data processing application failure resulted in Log Level Data loss for the "2020-05-27 03:00" hour.

Scope of Impact

Log Level Data and reports for the affected hour were incomplete or inaccurate during the incident window.

Timeline (UTC)

2020-05-27 03:00: Incident Started: Data processing application failure
2020-05-27 07:24: Issue Identified
2020-05-27 08:40: Engineers initiated data reprocessing
2020-05-27 12:01: Incident Resolved: Reprocessing finished
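
The intervals implied by the timeline can be worked out directly from the four timestamps above. The following is a minimal sketch; the dictionary keys and helper names are illustrative labels chosen here, not terms from the incident tooling.

```python
from datetime import datetime, timezone

FMT = "%Y-%m-%d %H:%M"


def parse_utc(stamp):
    """Parse a timeline timestamp and mark it as UTC."""
    return datetime.strptime(stamp, FMT).replace(tzinfo=timezone.utc)


# Timestamps copied from the timeline; the key names are illustrative.
timeline = {
    "started": parse_utc("2020-05-27 03:00"),
    "identified": parse_utc("2020-05-27 07:24"),
    "reprocessing_started": parse_utc("2020-05-27 08:40"),
    "resolved": parse_utc("2020-05-27 12:01"),
}


def minutes_between(start, end):
    """Whole minutes elapsed between two timeline events."""
    return int((timeline[end] - timeline[start]).total_seconds() // 60)


time_to_identify = minutes_between("started", "identified")
time_to_start_reprocessing = minutes_between("identified", "reprocessing_started")
reprocessing_duration = minutes_between("reprocessing_started", "resolved")
total_duration = minutes_between("started", "resolved")

print(time_to_identify, time_to_start_reprocessing, reprocessing_duration, total_duration)
# prints: 264 76 201 541
```

In other words, about 4 hours 24 minutes passed before the issue was identified, and the incident lasted 9 hours 1 minute in total.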

Cause Analysis

A defect in a program used for internal maintenance caused the data processing application to fail.

Resolution Steps

The engineers reprocessed the affected data, which resolved the issue.

Next Steps

  • Build better monitoring to detect future processing errors
  • Fix the internal maintenance program
  • Streamline the recovery process to allow a faster engineering response
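
As an illustration of the first item, one simple form of monitoring is a completeness check that flags any hour with no delivered data. This is a hypothetical sketch only: it does not describe the actual monitoring being built, and the function name `missing_hours` and the sample set of delivered hours are assumptions made for the example.

```python
from datetime import datetime, timedelta


def missing_hours(delivered, window_start, window_end):
    """Return each hour in [window_start, window_end) absent from `delivered`."""
    missing = []
    hour = window_start
    while hour < window_end:
        if hour not in delivered:
            missing.append(hour)
        hour += timedelta(hours=1)
    return missing


# Hypothetical input: the 02:00 and 04:00 hours were delivered, 03:00 was not.
delivered = {datetime(2020, 5, 27, 2), datetime(2020, 5, 27, 4)}
gaps = missing_hours(delivered, datetime(2020, 5, 27, 2), datetime(2020, 5, 27, 5))

for hour in gaps:
    print("missing hour:", hour.strftime("%Y-%m-%d %H:%M"))
# prints: missing hour: 2020-05-27 03:00
```

A check of this kind, run shortly after each hour is expected to land, could shorten the gap between failure and detection.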
Posted Jun 05, 2020 - 17:09 UTC

Resolved

The incident has been fully resolved. We apologize for the inconvenience this issue may have caused, and thank you for your continued support.

Posted May 27, 2020 - 12:21 UTC
Monitoring

We have patched the issue and are monitoring our systems closely. We will provide an update as soon as the issue has been fully resolved.

Posted May 27, 2020 - 10:35 UTC
Identified

We have identified the following issue:

  • Component(s): Log Level Data, Analytics reports
  • Impact(s):
    • Some data incomplete or incorrect until reprocessed (please repull data as necessary)
  • Severity: Partially Degraded
  • Datacenter(s): Global

Our engineers are actively working towards a resolution, and we will provide an update as soon as possible. Thank you for your patience.

Posted May 27, 2020 - 10:04 UTC