In case you missed it in the news, British Airways had an IT meltdown last weekend that left thousands of passengers grounded.
Some points to take away for your own network and IT infrastructure:
Everything critical should have dual power supplies. The incident is currently being blamed on a power surge or a power cut. Neither of these should have caused any issues. Each rack should have one PDU on UPS power and one PDU on non-UPS power, with each device's two power supplies split across them, so you are protected from either event.
Highly critical devices should be in a highly available state! Whether this is a server using VMware's High Availability option or two sets of routers in automatic HA, the technology makes this easy to implement, and it can be shared across the production and disaster recovery sites.
You should have a plan for disasters. Another factor blamed for the recent outage was a lack of personnel on the ground to fix and manage the problem. If you have devices that are mission critical to your business, you should have proper monitoring and an on-call person 24/7, so that your TTF (Time to Fix) is reduced because you know about the problem within 5 minutes of it happening.
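To make the monitoring point a bit more concrete, here is a minimal sketch of the logic behind "know about it within 5 minutes". It is only an illustration: the `Monitor` class, the 60-second check interval and the three-failure threshold are all assumptions, not a description of any particular monitoring product or of what British Airways runs.

```python
from dataclasses import dataclass, field

CHECK_INTERVAL_SECONDS = 60   # assumed: one probe per minute
FAILURES_BEFORE_PAGE = 3      # assumed: page after three failed probes in a row


@dataclass
class Monitor:
    """Tracks probe results for one device and decides when to page on-call."""
    device: str
    consecutive_failures: int = 0
    paged: bool = False
    pages: list = field(default_factory=list)

    def record(self, healthy: bool) -> bool:
        """Record one probe result; return True if this result triggers a page."""
        if healthy:
            # Device is back: reset so a later outage pages again.
            self.consecutive_failures = 0
            self.paged = False
            return False
        self.consecutive_failures += 1
        if self.consecutive_failures >= FAILURES_BEFORE_PAGE and not self.paged:
            # Page once per outage rather than on every failed probe.
            self.paged = True
            self.pages.append(f"PAGE on-call: {self.device} is down")
            return True
        return False


def worst_case_detection_seconds() -> int:
    """Upper bound from failure to page, ignoring probe timeouts."""
    return CHECK_INTERVAL_SECONDS * FAILURES_BEFORE_PAGE
```

With these assumed numbers the worst case is 180 seconds from failure to page, which sits inside the 5 minute target. Requiring a few failures in a row is a deliberate trade-off: it costs a couple of minutes of detection time but stops a single dropped probe from waking up whoever is on call.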