Data Center scheduled power outages - How do you handle them?

By rlcallison ·
In our Data Center, we have completely independent "A side" and "B side" power sources. For equipment that has a single power supply or three power supplies, we have implemented APC transfer switches that connect to both the A and B side feeds so we can maintain N+1 redundancy. Through recent testing, we proved that you can shut off either the A or B side and the other will support the entire Data Center's power load.

Given this design, our Facilities department would like to do new power connections and/or annual preventative maintenance (PM) on floor PDUs (Power Distribution Units) "cold", meaning that they could turn off the "A PDU" and connect new items or complete annual PMs while the "B PDU" is still providing power to the room. They state that this would make the job safer and would prevent any potential problems while hooking up 225A cables.

While I completely agree with them, there are a couple of folks in our organization that don't. They would like the work to be done "hot". Their feeling is that we would intentionally be removing the N+1 redundancy during that timeframe and wouldn't have anything to fail over to if needed.

I would like to hear from others on how they handle this type of work in their Data Centers. I would also like to see a count of how many of these types of scheduled outages you have per year.

If you need more specific information, feel free to email me at rlcallison@cmsenergy.com

Thanks!

Bob

I agree with the facilities folks on this one. It's much safer to do

by ManiacMan In reply to Data Center scheduled pow ...

considering they will be running 225 Ampere cables. Sounds like the ones against this don't fully comprehend the electrical, fire, and electrocution hazards involved in doing the work in a "hot" scenario. I side with the electricians, as all safety codes must be followed, even if it means having to shut down systems.

Hot work on 225A circuits?

by NickNielsen In reply to Data Center scheduled pow ...

What does your corporate safety officer say? S/He should have the final say. Here is some supporting information:
http://www.physics.ohio-state.edu/~p616/safety/fatal_current.html


While I haven't done N+1 work with power, the concepts transfer easily from communications circuits. Some points you can consider:

- Are there UPSs for the equipment in addition to the PDUs? If so, how long will the UPSs hold the primary equipment on line? Is this time shorter or longer than the PDU restore time?
- How long will it take to get the PDU being worked on back on line if the other goes down?
- How often do PDU failures occur? External power failures?
- When is your slowest (least load) time? Can the work be scheduled then?
- Is a temporary backup generator feasible?
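
The first two questions above reduce to a simple comparison. Here is a minimal Python sketch of it; the function name, the safety margin, and the minute figures in the usage note are all made up for illustration and do not come from anyone's real site.

```python
def ups_covers_restore(ups_runtime_min: float,
                       pdu_restore_min: float,
                       margin: float = 1.5) -> bool:
    """Return True if the UPS can hold the load long enough to restore the PDU.

    ups_runtime_min: how long the UPS holds the equipment on line (minutes).
    pdu_restore_min: how long it takes to bring the PDU under maintenance
                     back on line (minutes).
    margin:          hypothetical safety factor applied to the restore time.
    """
    if ups_runtime_min < 0 or pdu_restore_min < 0 or margin <= 0:
        raise ValueError("times must be non-negative and margin must be positive")
    # The UPS must outlast the restore time scaled by the safety margin.
    return ups_runtime_min >= pdu_restore_min * margin
```

For example, with an assumed 30 minutes of UPS runtime and a 15 minute restore, `ups_covers_restore(30, 15)` is true; with only 20 minutes of runtime it is false at the default margin. This only applies where the UPS still feeds the equipment when a PDU is lost.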

Hope this helps. I also hope you win your battle. In my experience, only a very heavily-insured fool will work on a live 225A circuit.

Have those "other" members grab hold of a 225A live wire with their hands

by ManiacMan In reply to Hot work on 225A circuits ...

Did someone order a pompous *** manager extra crispy? :^0

If those *** fail to understand what a 225 Amp circuit is capable of, why not give them a live demonstration by sticking their hands inside the master breaker box?

Message was edited by: beth.blakely@...

How to handle Scheduled power maintenance

by greg.couch In reply to Data Center scheduled pow ...

Good question to ask, and you are on the right track. Never work on a "hot" system if it can be avoided. The risk is far too great in both safety and potential downtime due to problems that will inevitably be encountered during the maintenance process. This is exactly the reason for an N+1 power supply system. Those "others" that you refer to in your organization are no doubt desk jockeys who will not have to stay up for hours on end repairing a completely offline system (and their careers) when something goes wrong, and it will. Plan for the worst and hope for the best. But remember that hope is for engineers with their fingers crossed; planning and preparing for the worst is for those of us who have to deal with the realities of customer-service-impacting outages. Listen to those in the office who will later say, "That shouldn't have happened," and your fate will be sealed while theirs will continue on unaffected. Have a good back-out plan and develop a MOP (Method of Procedure) that covers all possibilities. Get buy-in from the engineering, operations, and maintenance team managers. Good luck... but luck is for rabbits. (former Operations Manager-VZW)

Another option

I agree with everyone about not working on hot circuits, regardless of amp ratings, even if the redundancy is no longer in effect.

If you do not have the luxury of gambling, there is one other option. On one occasion I had a similar type of situation and presented the idea of having a backup emergency generator on standby for the entire maintenance interval. That particular company felt having a generator on site was worth the added expense for peace of mind even though it was not needed.

Data Center Scheduled Power Outages - Additional Comments / Clarification

by rlcallison In reply to Another option

First I want to thank everyone for their support to my original posting. I'd like to clarify a couple of things.

We just went through a complete electrical renovation project for our Data Center that was intended to give us room for future growth in circuit breaker capacity and redundancy. Therefore, ALL of the main equipment is brand new (generators, UPSs, PDUs, etc.). This project included dual generators, each feeding new Eaton Powerware UPSs, which in turn were connected to an "A" or "B" set of floor-mounted PDUs and Remote Power Panels. We then used a pair of SquareD brand PowerBus 225 electrical bus bars over the top of our Server Pods so that we have an "A" and "B" power source for each server rack. This part of the design (bus bars) has been in place for probably 3 years. Each bus bar needed to be re-fed from its previous power source over to the new floor PDUs. Internal to our racks, we have at a minimum two PDUs, one connected to the "A bus" and the other to the "B bus". By doing this, we can turn off ANYTHING on the "A side" and the "B side" will carry the load and keep EVERYTHING running.

For scheduling purposes, we refer to our work on anything in the Data Center as a "Scheduled Outage", but in this case NONE of the server-type equipment in the Data Center will physically go down; it will just be running with one less power supply during the scheduled work. To make this even better, we happen to be the public utility providing gas and electric to the majority of our state and therefore have the added advantage of dual feeds to this site from separate substations on the power grid.

In this whole discussion, the "rub" comes from the fact that we would intentionally turn off one server power supply per device in about 35 server racks at a time when the "A or B" bus goes down. This only equates to about 1/4 of our whole Data Center because of the way we separated our power loads throughout the room.

The only exposure we have as I see it would be if the 2nd power supply in a given server failed while power was already down to the other side.
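
That exposure can be roughed out with a quick calculation. The Python sketch below assumes a constant failure rate (exponential model) and independent power supplies; the function name, the MTBF figure, the device count, and the window length are hypothetical placeholders, not measurements from this site.

```python
import math


def p_any_failure(n_supplies: int, window_hours: float, mtbf_hours: float) -> float:
    """Probability that at least one of n independent power supplies
    fails during the maintenance window (exponential failure model)."""
    if n_supplies < 0 or window_hours < 0 or mtbf_hours <= 0:
        raise ValueError("counts and times must be non-negative, MTBF positive")
    # One supply survives the window with probability exp(-t / MTBF),
    # so all n survive with probability exp(-n * t / MTBF).
    # expm1 keeps precision when the exponent is tiny.
    return -math.expm1(-n_supplies * window_hours / mtbf_hours)


if __name__ == "__main__":
    # Hypothetical inputs: 35 racks x 10 devices each, a 4 hour window,
    # and an assumed 100,000 hour MTBF per power supply.
    p = p_any_failure(n_supplies=35 * 10, window_hours=4.0, mtbf_hours=100_000.0)
    print(f"Chance of at least one remaining supply failing: {p:.2%}")
```

With those made-up inputs the result is roughly 1.4% per maintenance window, and a single failure would take down only the one affected device, not the room. Plugging in the real device count and the vendor's MTBF would give a number to put in front of the people pushing for hot work.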

I hope this further clarifies our design and why I'm surprised that we're getting the pushback that we are.

Thanks!

Bob

Tell the naysayers to scrub off

by NickNielsen In reply to Data Center Scheduled Pow ...

From the sound of things, they are worried about the .01% chance.

I'll say it out loud this time: if it's so important to them that the work be done on live circuits, let them go in and do it.
