Microsoft is working to fully restore key services in its Azure cloud platform after an outage in a US region.
Since 9am GMT on Tuesday, Azure services have been intermittently unavailable to customers of Azure’s South Central US region, due to an outage affecting Azure Storage and the wide range of services that rely on it, including Machine-Learning Studio, Cloud Services, Database for SQL, Functions and more.
While the outage initially resulted in online Microsoft services such as Office 365 and Azure Active Directory being unavailable for some customers in other regions, access has now been restored, with the problems attributed to congestion caused by automated failover procedures.
As of 11am GMT on Wednesday, some Azure services in the South Central US region remain affected, although Microsoft’s support site says “engineers have restored storage availability for the majority of impacted services, and most customers should now be seeing signs of recovery”.
Azure engineers are now working on recovering impacted Azure Storage scale units and the remaining storage-dependent services in South Central US, work Microsoft says is mostly complete.
Microsoft blamed the outage on a “severe weather event”, which it said included lightning strikes that occurred near one of the South Central US datacenters.
“This resulted in a power voltage increase that impacted cooling systems. Automated datacenter procedures to ensure data and hardware integrity went into effect and critical hardware entered a structured power down process,” according to the Azure status update.
An infrastructure outage at one of the major cloud platforms can take down many of the high-level services these platforms offer, despite the redundancy and automated recovery built into them.
In March last year, a wide range of Amazon Web Services offerings in the US-EAST-1 region were unavailable after an engineer inadvertently took a large number of Amazon Simple Storage Service (S3) servers offline.
While cloud outages hit the headlines because of the large numbers of people affected, some analysts advise that the advantages of the public cloud over in-house infrastructure offset the risk of an occasional hiccup.
Modern firms are so reliant on the major cloud platforms that earlier this year insurance provider Lloyd’s warned that the failure of a top cloud service for just three days could cost the US economy $15bn.