Data Disasters | Episode 2 - Microsoft meltdown shuts down email accounts worldwide

Data disaster! Environmental factors like excess heat, flooding and power outages can wreak havoc on any business. The second in our series about the worst data centre and other infrastructure meltdowns in recent history deals with the greatest existential threat to your critical electronic assets - heat.


March 2013 – You've (NOT) Got Mail... Overheating shuts down Microsoft's email services for 16 hours 

Email, and the instant contact and ability to send rich content that it enables, has become such an integral part of business and social life that it's difficult to remember how we ever lived without it.

But thousands of users of Microsoft's Hotmail and Outlook.com email services were given a reminder of the snail-mail days when what was meant to be a routine firmware upgrade to servers caused a catastrophic temperature spike, forcing the software giant to power off its infrastructure completely to prevent hardware damage.

Microsoft VP Arthur de Haan tried to downplay the event, but it was undeniably bad timing for the company, which was already copping flak for not being ready for 'the cloud'. In the wake of the failure, users took to Twitter to vent their frustration in posts tagged #hotmailfail and #hotmaildown.


The Lessons

This incident highlights the critical importance of monitoring for the ever-present danger of server failure due to overheating. No other factor presents such a singular danger to the proper operation of electronic and computing equipment found in data centres and server rooms.

It's for this reason that our base units come complete with in-built temperature sensors. Single-site monitoring, especially in larger server rooms, is not enough on its own though. Best practice, in line with recommendations of ASHRAE and others, states that temperature probes should be placed at a minimum of 3 points per server rack.

One SensorGateway plus 2 connected temperature probes can achieve this simply and cost-effectively. As per the diagram, one probe (the SensorGateway, for example) is placed at the bottom front where cool air enters, one at the top front of the rack to ensure the cool air is reaching the highest point, and the third at the top back of the rack (usually the hottest point).
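For readers who script their own alerting, the three-probe layout can be sketched roughly as follows. This is a minimal illustration only, not SensorGateway's actual software or API: the names RackReadings and check_rack are hypothetical, the inlet band assumes the commonly cited ASHRAE recommended range of 18-27 C, and the exhaust limit is a placeholder you would set for your own equipment.

```python
from dataclasses import dataclass

# Assumed limits in degrees Celsius (not taken from the article).
# The inlet band follows the commonly cited ASHRAE recommended range;
# the exhaust limit is a placeholder value.
INLET_MIN_C = 18.0
INLET_MAX_C = 27.0
EXHAUST_MAX_C = 40.0


@dataclass
class RackReadings:
    """One reading per probe position on a single rack (hypothetical type)."""
    bottom_front_c: float  # where cool air enters
    top_front_c: float     # confirms cool air reaches the highest point
    top_back_c: float      # usually the hottest point


def check_rack(readings: RackReadings) -> list[str]:
    """Return a list of alert messages; an empty list means all probes are in range."""
    alerts = []
    inlets = (
        ("bottom front", readings.bottom_front_c),
        ("top front", readings.top_front_c),
    )
    for name, value in inlets:
        if value > INLET_MAX_C:
            alerts.append(f"{name} inlet too hot: {value:.1f} C")
        elif value < INLET_MIN_C:
            alerts.append(f"{name} inlet too cold: {value:.1f} C")
    if readings.top_back_c > EXHAUST_MAX_C:
        alerts.append(f"top back exhaust too hot: {readings.top_back_c:.1f} C")
    return alerts


if __name__ == "__main__":
    # Example: cool air is not reaching the top front of the rack.
    for alert in check_rack(RackReadings(21.0, 29.5, 38.0)):
        print(alert)
```

Checking the two front probes separately is the point of the layout: a healthy bottom-front reading alongside a hot top-front reading suggests cool air is not reaching the top of the rack, which a single sensor would miss.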

 


Keep your eyes peeled for Episode 3 and hear the red hot tale of how an electrical fire took out the internet in Azerbaijan. ALL of the internet...
