Delayed Analytics Reporting
Incident Report for Xandr
Postmortem

Incident Summary

From approximately 10:00 UTC on Friday, October 1st, 2021, to 15:56 UTC on Saturday, October 2nd, 2021, the deployment of a release prevented reporting data warehouse jobs from executing, causing delays to analytics reporting.

Scope of Impact

During the incident window, clients experienced delays in analytics reporting of up to 15 hours.

Timeline (in UTC)

2021-10-01 10:00: Incident Started: Release was deployed to production
2021-10-01 21:28: Engineers were notified of failed jobs, which caused an SLA breach for network analytics reporting
2021-10-02 01:00: Incident ticket created
2021-10-02 15:56: Fix was deployed
2021-10-04 21:52: Incident Resolved

Cause Analysis

A release caused reporting data warehouse jobs to fail, and the resulting reporting delays exceeded our SLA.

Resolution Steps

Once the issue was identified, our engineers released a fix, which resolved the reporting delays.

Next Steps
• Improve alerting
• Improve release plan

Posted Oct 19, 2021 - 16:44 UTC

Resolved

The incident has been fully resolved. We apologize for the inconvenience this issue may have caused, and thank you for your continued support.

Posted Oct 02, 2021 - 16:59 UTC

Monitoring

We have patched the issue and are monitoring our systems closely. We will provide an update as soon as the issue has been fully resolved.

Posted Oct 02, 2021 - 01:26 UTC

Investigating

We are currently investigating the following issue:

  • Component(s): API, UI
  • Impact(s):
    • Slower report retrieval
  • Severity: Major Outage
  • Datacenter(s): Global

We will provide an update as soon as more information is available. Thank you for your patience.

Posted Oct 02, 2021 - 01:14 UTC