Platform Service Disruption - eCommerce
Incident Report for ACME Technologies
Postmortem

Dear clients,
I want to update you on the outages that occurred on January 24th, 2023, during a planned release. First of all, I sincerely apologize on behalf of the company. This was our error. Our Root Cause Analysis found that two events contributed to the outage.

The first event was an issue with a database update on one of our large tables. Because the update applied to tens of millions of rows, the database locked a specific transactional table, which caused checkout to fail. The second event was an unnoticed code merge conflict that caused a front-end component to fail, which impacted our ACME-hosted eCommerce pages.
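
As an illustration only, and not a description of our actual tooling, the second event is the kind of problem an automated check can sometimes catch. The sketch below assumes the unresolved conflict left standard Git conflict markers in a source file, which may not have been the case here; the function name `find_conflict_markers` is hypothetical.

```python
from typing import List


def find_conflict_markers(text: str) -> List[int]:
    """Return 1-based line numbers that look like Git conflict markers.

    Assumption: markers use Git's default form, i.e. lines starting with
    '<<<<<<< ' or '>>>>>>> ', or a line that is exactly '======='.
    """
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if line.startswith(("<<<<<<< ", ">>>>>>> ")) or line == "=======":
            hits.append(number)
    return hits


if __name__ == "__main__":
    sample = "a\n<<<<<<< HEAD\nb\n=======\nc\n>>>>>>> feature\nd\n"
    # Any non-empty result would fail the check before release.
    print(find_conflict_markers(sample))
```

A check like this only finds leftover markers. A conflict that was resolved incorrectly but cleanly would still need smoke tests to surface.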

Moving forward, although we already review database scripts manually, we will improve our release management process. First, we are improving how we estimate the time database scripts take to run against production-sized tables. This will help us determine in advance whether a release requires downtime. Second, we are improving our smoke testing steps to catch what manual code reviews may miss.
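
To make the first improvement concrete, here is a minimal sketch of one way such an estimate could work: time the script against a sample of rows, then extrapolate linearly to the full table. This is an assumption for illustration, not our actual procedure. The function names and the threshold are hypothetical, and real run times often do not scale linearly because of indexes, locking, and I/O.

```python
def estimate_runtime_seconds(sample_rows: int, sample_seconds: float,
                             total_rows: int) -> float:
    """Linearly extrapolate a script's run time from a timed sample.

    Assumption: cost per row is roughly constant, which is only a
    first approximation for large tables.
    """
    if sample_rows <= 0:
        raise ValueError("sample_rows must be positive")
    return sample_seconds * total_rows / sample_rows


def requires_downtime(estimated_seconds: float,
                      threshold_seconds: float) -> bool:
    """Flag a release for planned downtime when the estimate exceeds
    a (hypothetical) acceptable lock window."""
    return estimated_seconds > threshold_seconds


if __name__ == "__main__":
    # Example: 100,000 sample rows took 2 seconds; the table has 50 million rows.
    estimate = estimate_runtime_seconds(100_000, 2.0, 50_000_000)
    print(estimate, requires_downtime(estimate, threshold_seconds=60.0))
```

An estimate like this is a lower bound at best, so it is most useful as a trigger for a closer look rather than as a guarantee.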

We continue to invest in our tooling and in the size of our release management team to improve our service availability. Clearly, we fell short of expectations in this release. The team has already started incorporating the process changes by updating our Roll Out Plan procedures for the upcoming planned releases. We will ensure the same events do not happen again.

Echeyde Cubillo, Chief Technology Officer/Co-founder

Posted Feb 10, 2023 - 09:45 PST

Resolved
This outage affecting ACME eCommerce has been resolved. Details regarding the issue and the resolution will be shared after our internal review of the incident is complete.
Posted Jan 24, 2023 - 18:53 PST
Update
The fix to the ACME eCommerce platform has been applied and we see eCommerce traffic resuming. ACME engineers are monitoring the fix to ensure there are no additional issues.
Posted Jan 24, 2023 - 18:47 PST
Monitoring
ACME engineers are applying a fix to resolve the issue affecting the ACME eCommerce platform. They are monitoring the fix to ensure the issue is resolved.
Posted Jan 24, 2023 - 18:33 PST
Identified
ACME engineers have identified the issue affecting the ACME eCommerce platform and are working on a fix to fully resolve it.
Posted Jan 24, 2023 - 18:28 PST
Investigating
ACME is experiencing a service disruption that is impacting the ACME eCommerce (B2C) application(s).

Our engineering team is investigating the issue and we will update this incident as soon as possible.
Posted Jan 24, 2023 - 18:13 PST
This incident affected: ACME Platform (ACME eCommerce (B2C), ACME Backoffice (B2B), ACME Sales POS application).