How Azure manages its crisis communication in the event of an outage

Outages can be particularly damaging for cloud computing services. Microsoft knows this all too well and has spent many months developing a sophisticated plan to limit this kind of disruption on its own cloud service, Azure. As part of that plan, Microsoft is using its Azure Status page less and less to notify users of cloud service outages.

Last March, when one of Microsoft’s busiest regions, the eastern United States, was hit by an outage lasting several hours, nothing appeared on the status page, and there was very little protest on Twitter. It turns out that this relative calm is intentional. Microsoft has been working for some time to get Azure users to check for possible cloud service outages on their custom Service Health pages rather than on the Azure Status page, which is aimed at the general public.

The Azure Support account on Twitter directs users to these pages, and / or invites them to send it a message, when they need the most recent information about an outage. In a blog post this week, Sami Kubba, a senior program manager overseeing Azure’s outage communication process, described Microsoft’s current approach and its goals for outage crisis communication.

Avoid a deluge of complaints on the web

According to Kubba, Microsoft’s goal is to notify all affected Azure subscriptions within 15 minutes of an outage. Microsoft relies on human staff, plus push notifications, to do this. Kubba noted that automatic notifications via Service Health accounted for more than half of Microsoft’s outage communications during the last quarter.

“We are also in the early stages of expanding our use of AI-based operations to automatically identify affected services and, after mitigation, send resolution communications (for supported scenarios) as quickly as possible,” the manager adds. He acknowledges that Microsoft currently uses the public Azure Status page only to communicate “generalized” outages, that is, those that affect multiple regions and / or multiple services.

Microsoft is working to make this type of outage notification system consistent across its other cloud computing products, including Microsoft 365 and Power Platform, he notes. Customers of these services can already follow the M365 Status account on Twitter, which directs users to their portals and lets them send it a message if there is a problem.

Source: ZDNet.com
