From approximately 20:30 UTC to 22:15 UTC on Thursday, Sep 14, 2023, users were unable to access buyside API/UI pages.
The issue was caused by a runtime error that resulted in an out-of-memory condition: the memory limits set on the API containers were too low once hyperthreading was introduced in the datacentre. As a result, the pods were OOM-killed, and no pods remained operational long enough to handle the traffic load.
The issue was resolved by increasing the memory limit of the API pods to accommodate the higher memory demands of a hyperthreaded environment. The engineering team also worked together to optimize the server configuration so that it handles hyperthreading without requiring additional memory.
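
For illustration only, the kind of server-configuration change described above could look like the sketch below. It assumes, which this report does not state, that the API server starts one worker process per logical CPU, so that enabling hyperthreading doubles the worker count and with it the memory footprint. The function name, parameters, and figures are hypothetical and are not the actual production configuration.

```python
def worker_count(logical_cpus: int, memory_limit_mib: int,
                 per_worker_mib: int, reserve_mib: int = 0) -> int:
    """Pick a worker count that fits inside the container memory limit.

    Hypothetical sizing rule: use one worker per logical CPU, but never
    more workers than the memory limit can hold, and always at least one.
    """
    if per_worker_mib <= 0:
        raise ValueError("per_worker_mib must be positive")
    # Memory left for workers after setting aside a fixed reserve.
    usable_mib = memory_limit_mib - reserve_mib
    # How many workers fit in the usable memory (floor division).
    by_memory = usable_mib // per_worker_mib
    # Cap by both CPU count and memory, but keep at least one worker.
    return max(1, min(logical_cpus, by_memory))


# Assumed example figures: 2048 MiB limit, 256 MiB per worker.
before_ht = worker_count(logical_cpus=4, memory_limit_mib=2048, per_worker_mib=256)
after_ht = worker_count(logical_cpus=16, memory_limit_mib=2048, per_worker_mib=256)
```

With a cap of this kind, a jump in logical CPU count no longer pushes the total footprint past the container limit; the worker count stops growing once the memory budget is used up.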
The incident has been fully resolved. We apologize for the inconvenience this issue may have caused, and thank you for your continued support.
We have patched the issue and are monitoring our systems closely. We will provide an update as soon as the issue has been fully resolved.
We are currently investigating the following issue:
Status: We will provide an update as soon as more information is available. Thank you for your patience.