Amazon Web Services Outage Causes Disruption for Thousands of Customers

On Tuesday, Amazon Web Services (AWS), the cloud computing unit of e-commerce behemoth Amazon, grappled with widespread outages that impacted multiple services, leaving thousands of customers facing difficulties for several hours.

The trouble began shortly before noon Pacific Time, when users started encountering problems such as authentication and sign-in errors while accessing various AWS services. Amazon’s engineering teams launched an investigation and identified an issue within a subsystem connected to AWS Lambda, which directly and indirectly caused errors for customers.

AWS Lambda is a service that enables customers to execute computer programs without the need to manage underlying servers. Notably, several prominent companies, including T-Mobile, Netflix, and Autodesk, rely on Lambda, according to Reuters. The disruption also affected other Amazon services, including Amazon Music and Alexa.

Downdetector.com, a platform that aggregates user-submitted error reports to track outages, registered a peak of approximately 12,000 reports related to the problem. Users reported interruptions to their operations and voiced their frustration while the outage lasted.

Around 2:00 p.m. Pacific Time, AWS announced that many of its services had fully recovered and been marked as resolved. The company said it was still working to restore the remaining services.

However, half an hour later, AWS acknowledged that it was still processing the backlog of asynchronous Lambda invocations that had accumulated during the outage, including invocations from other AWS services such as Simple Queue Service (SQS) and EventBridge. The Lambda team said it would be working through these messages over the next few hours and cautioned users to expect continued delays in the execution of asynchronous invocations.

Finally, shortly before 3:40 p.m. Pacific Time, AWS reported that the backlog had been fully processed and the issue resolved. The company said all AWS services were operating normally again.

Amazon shares were little changed in after-hours trading on Tuesday following the outage.

While the swift resolution of the issue by AWS is commendable, the disruption highlights the potential risks associated with widespread reliance on cloud computing services. The incident serves as a reminder for companies to consider contingency plans and backup systems to mitigate the impact of such outages on their operations.

As technology continues to evolve, ensuring the robustness and reliability of cloud services becomes increasingly crucial. Customers, in turn, must stay vigilant and proactive in monitoring and addressing any potential disruptions that may arise, ultimately safeguarding their own businesses and user experiences.
