An Amazon Web Services outage that caused widespread disruption to websites on the East Coast this week has been blamed on a typo entered by one of the company’s workers.
The outage hit Amazon Web Services on Tuesday, disrupting many websites on the East Coast as the cloud giant experienced problems with its Simple Storage Service (S3). Widely used for backup and archiving, S3 is relied on by a host of companies.
In a statement, Amazon Web Services explained that a member of its S3 team was attempting to debug an issue with the S3 billing system. At 12:37 p.m. EST the worker executed a command that was intended to remove a small number of servers for one of the S3 subsystems used in the billing process. “Unfortunately, one of the inputs to the command was entered incorrectly and a larger set of servers was removed than intended,” Amazon explained in its statement. “The servers that were inadvertently removed supported two other S3 subsystems.”
The web giant noted that removing the servers caused each of the affected subsystems to require a full restart. S3 was operating normally again by 4:54 p.m. EST Tuesday, it said.
“We want to apologize for the impact this event caused for our customers,” said Amazon, in its statement. “While we are proud of our long track record of availability with Amazon S3, we know how critical this service is to our customers, their applications and end users, and their businesses. We will do everything we can to learn from this event and use it to improve our availability even further.”
In its statement, Amazon Web Services explained that it is making a number of changes as a result of the service disruption. These include modifying the tool used in the snafu to ensure that server capacity is removed more slowly. The web giant is also adding safeguards to prevent systems from being taken below their minimum required capacity levels.
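The two safeguards Amazon describes, removing capacity gradually and refusing to go below a minimum capacity level, can be illustrated with a short sketch. This is not Amazon's tool: the `Subsystem` class, the `plan_removal` function, the `max_per_step` parameter and the numbers used are all hypothetical, chosen only to show the idea.

```python
from dataclasses import dataclass


class CapacityError(Exception):
    """Raised when a removal request would breach the capacity floor."""


@dataclass
class Subsystem:
    # Hypothetical model of a subsystem; not Amazon's actual data structure.
    name: str
    active_servers: int
    min_required: int  # assumed minimum number of servers that must stay up


def plan_removal(subsystem: Subsystem, requested: int, max_per_step: int = 2) -> list[int]:
    """Return the batch sizes for removing `requested` servers.

    The whole request is rejected if it would leave the subsystem below
    its minimum required capacity. An accepted request is split into
    batches of at most `max_per_step` servers so capacity is removed
    gradually rather than all at once.
    """
    if requested < 0:
        raise ValueError("requested must be non-negative")
    if max_per_step < 1:
        raise ValueError("max_per_step must be at least 1")

    remaining = subsystem.active_servers - requested
    if remaining < subsystem.min_required:
        raise CapacityError(
            f"removing {requested} servers from {subsystem.name} would leave "
            f"{remaining}, below the minimum of {subsystem.min_required}"
        )

    batches = []
    left = requested
    while left > 0:
        step = min(max_per_step, left)
        batches.append(step)
        left -= step
    return batches
```

With a hypothetical subsystem of 10 servers and a floor of 6, `plan_removal(Subsystem("billing", 10, 6), 3)` returns `[2, 1]`, while a mistyped request for 40 servers raises `CapacityError` before anything is removed.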
The incident highlights the vital role that Amazon’s cloud technology plays for many businesses. “This brings to light the reality that all technology will fail eventually – even ones that are ‘too big to fail’ like Amazon,” said Shawn Moore, CTO at web experience specialist Solodev, in a statement emailed to Fox News, noting that not all businesses hosted on Amazon experienced problems. “The ones who have fully embraced Amazon’s design philosophy to have their website data distributed across multiple regions were prepared.”
Follow James Rogers on Twitter @jamesjrogers