A failure of Microsoft’s Azure cloud computing platform, which occurred on 19 November, was caused by an operator error, Microsoft announced on its Azure blog on Wednesday.
The update being rolled out sent the Blob (Binary Large Object) storage front ends into an infinite loop, leaving them unable to take traffic.
“Unfortunately the issue was widespread since the update was made across most regions in a short period of time due to operational error, instead of following the standard protocol of applying production changes in incremental batches,” Microsoft corporate vice-president Jason Zander wrote on the Azure blog.
Azure storage services were temporarily unavailable across multiple regions.
“During the rollout we discovered an issue that resulted in storage blob front ends going into an infinite loop, which had gone undetected during flighting [testing an update before applying it, author’s note].
“The net result was an inability for the front ends to take on further traffic, which in turn caused other services built on top to experience issues,” wrote Zander.
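The "incremental batches" protocol Zander describes can be illustrated with a minimal sketch. It is not Microsoft's deployment tooling: the region names, the `rollout_in_batches` function and the health check are all hypothetical, and the simulated update is assumed to break every region it reaches. The point is only that a small batch size limits how far a faulty update spreads before a health check stops it.

```python
from typing import Callable, List

# Hypothetical region names, for illustration only.
REGIONS = ["region-a", "region-b", "region-c", "region-d", "region-e", "region-f"]


def rollout_in_batches(
    regions: List[str],
    batch_size: int,
    apply_update: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> List[str]:
    """Apply an update one batch at a time and halt after the first unhealthy batch.

    Returns every region the update reached, including the unhealthy ones.
    """
    updated: List[str] = []
    for start in range(0, len(regions), batch_size):
        batch = regions[start:start + batch_size]
        for region in batch:
            apply_update(region)
            updated.append(region)
        # Check the batch before moving on; stop the rollout if anything broke.
        if not all(is_healthy(region) for region in batch):
            break
    return updated


def simulate(batch_size: int) -> List[str]:
    """Simulate a faulty update (assumed to break every region it touches)."""
    broken = set()

    def apply_update(region: str) -> None:
        broken.add(region)

    def is_healthy(region: str) -> bool:
        return region not in broken

    return rollout_in_batches(REGIONS, batch_size, apply_update, is_healthy)


if __name__ == "__main__":
    print("Batches of 2:", simulate(2))
    print("All at once: ", simulate(len(REGIONS)))
```

With batches of two, the simulated fault stops after the first two regions. Applying the update to every region in one step, as happened here according to Zander, leaves nothing unaffected by the time the problem is noticed.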
Although cloud outages like this one are becoming less frequent, businesses large and small remain wary.
“Any serious IT team using cloud will not use it for important information, unless it has backup protections for the data, which can undermine the business case for cost reduction by using cloud,” said Paul Hinton, commercial technology partner at law firm Kemp Little.
The Azure service was not fully operational for more than 11 hours, an outage that is likely to have cost Microsoft millions of pounds in lost revenue.