Incident Summary
On Tuesday, Nov 11, 2021 at approximately 16:43 UTC, a combination of load balancing issues and a bug tied to authorization caused intermittent 403s, causing API calls to fail in the fetching of brand data and impacting user workflow.
Scope of Impact
Users may have experienced intermittent 403 errors, which affected API calls to retrieve brand data.
Timeline (UTC)
2021-11-23 16:43 Incident Started: Customer reports of receiving 403s when calling /brand endpoint.
2021-11-24 14:48 Incident Ticket Created
2021-11-24 16:48 Escalated to Engineering
2021-11-24 16:58 Load balancing issue identified
2021-11-24 17:29 Mitigation of unbalanced traffic completed
2021-11-24 23:08 Bug in logic handling database cache refresh identified
2021-11-26 19:07 Load balancing issues determined to be underlying cause. Additional hardware brought online to address
2021-11-30 14:11 Additional client reports of intermittent errors
2021-12-03 17:27 Incident Resolved: Intermittent 403's mitigated
Cause Analysis
The incident was caused by an API bug impacting authorization in conjunction with a load balancing issue.
Resolution Steps
Our engineers resolved the issue by implementing a fix to the authorization service. Increasing bandwidth in the AMS datacenter helped mitigate the issue further. Additional research into properly load balancing services is ongoing as of now.
Next Steps
The incident has been fully resolved. We apologize for the inconvenience this issue may have caused, and thank you for your continued support.
We have identified the cause of the issue, and our engineers are actively working towards a resolution. We will provide an update as soon as possible. Thank you for your patience.
We are currently investigating the following issue:
We will provide an update as soon as more information is available. Thank you for your patience.