Printers

How the Help Desk Failed the Enterprise during Disaster Recovery


I've recently been consulting for a company that had a minor disaster. The disaster was minor only in the sense that, although the corporate building was rendered unusable by an Act of God, the server room survived without a single drive lost to heat or an uncontrolled shutdown.

I had read the company's disaster recovery plan, such as it was -- a copy and paste off a well known Internet template -- and I knew there'd be problems if anything ever happened. Then lightning struck, literally, late on a Friday night.

In spite of the fact that this $1+ billion company had no workable, tested recovery plan, they managed to restore network connectivity to loaner PCs at an alternate work site and for all intents and purposes, services were restored by Monday morning.

So where did the help desk fail the enterprise? Let me enumerate the ways:

(1) No printers mapped. There were printers aplenty in the alternate work space, but when they imaged the loaner machines, the support techs didn't bother to install a single printer. Furthermore, none of the available printers were labeled with IP addresses or names, so the only way to look one up was by make and model. While veteran IT people weren't bothered by such trivial matters, the "typical" end users were lost. The "add printer" wizard might as well have had a Russian-language interface. Eventually one of the tech guys came around with a 3.5" floppy disk and ran a script that installed printers on PCs in the various work areas (a sketch of that kind of script follows this list). It was a little late and a little lame, but users were finally able to print. Lesson learned: If you can get PCs installed and networked, take five more minutes per machine and install the closest printer.

(2) No phones or voicemail.  There were phones in the alternate work site, but the ratio was one phone to five or six employees, none of the original extensions worked, and no one knew what the new extensions were.  It was every bit of two business days before someone got around to compiling a list of where people were sitting and communicating the list out.  Meanwhile, third parties calling the old direct-dial numbers were getting fast busy signals or an outgoing message stating that the number was no longer in service.  Lesson learned:  Make restoration of the phone system a recovery priority. If you can't restore full functionality of the phone system, at least work with the phone company to reroute incoming calls as quickly as possible. 

(3) Minimal communication.  The managers of various teams tried their best to communicate facts about system status and when core applications and shared drives would become available. The problem was the line of business managers weren't getting timely reports from the IT people managing the recovery.  As a result, people were making up wild stories based on rumors about the state of the system.  That lack of communication didn't instill confidence in the end users that IT knew what it was doing.  Lesson learned:  No matter how busy you are rebuilding servers or restoring connectivity, someone in the IT support organization has to be the "point person" for communication with the lines of business.
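The printer script mentioned in item (1) doesn't have to be anything exotic. Below is a minimal sketch of the idea in Python, assuming Windows loaner PCs and a reachable shared print server; the server name, queue names, and work-area labels are hypothetical placeholders, and the actual mapping is done by the stock printui.dll entry point that ships with Windows.

"""Minimal sketch of a per-machine printer mapping script (assumes Windows
loaner PCs and a shared print server; all names below are placeholders)."""
import subprocess
import sys

PRINT_SERVER = r"\\printsrv01"   # hypothetical print server name
PRINTERS = {                     # work area -> nearest shared queue
    "floor1-east": "HP-12",
    "floor1-west": "HP-13",
    "floor2": "HP-21",
}


def map_printer(queue: str, make_default: bool = False) -> None:
    """Attach a shared network printer via the built-in printui.dll
    entry point, so no extra tools are needed on the loaner machine."""
    share = f"{PRINT_SERVER}\\{queue}"
    subprocess.run(
        ["rundll32", "printui.dll,PrintUIEntry", "/in", "/n", share],
        check=True,
    )
    if make_default:
        subprocess.run(
            ["rundll32", "printui.dll,PrintUIEntry", "/y", "/n", share],
            check=True,
        )


if __name__ == "__main__":
    # Usage: python map_printer.py floor1-east
    area = sys.argv[1] if len(sys.argv) > 1 else "floor1-east"
    map_printer(PRINTERS[area], make_default=True)

Run once per imaged machine (or drop it in the image itself) and each loaner PC comes up with the nearest printer already mapped and set as the default.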

If you're lucky enough to work for organizations that know the value of a good business continuity (disaster recovery) plan, you may be chuckling at the lessons learned I've presented as "lessons that should be obvious."  For those of you who aren't so lucky, take note. You never know when lightning is going to strike, but you should assume that it will, eventually.

 

10 comments
webmaster6

I think the main reason for the failure was the help desk itself. If you solve the problems above, then everything should go right in the future. Thank you, Damon Thomas http://www.windowsdatarecovery.com

Bobwon

A good DR plan includes tests where you stage mock disasters. By performing these tests you will find the places where you have a hole in your plans. This is a case where you would have found the issues with the printers.

GSG

We recently had a disaster, nothing so drastic as a lost building, but we lost the phone system, the core (and thus the whole network), and many of our older servers blew power supplies, controllers, etc. Our first task was to communicate. We appeared in person at each dept and told them what was going on, and handed them their walkie talkies as part of the DR plan. Second, we got the PBX online and got the phones running, then the core, etc. The one thing that we got the most compliments on was our communication. Even after phones were up, we still rounded in person and gave updates frequently. All in all, it was very smooth, and we learned one key thing: keep the after-hours contact numbers on paper, not just online! We now have certain key bits of info on paper in several locations.

gregoirema

george_ou wrote: I created a webpage with simple install scripts for the printers. I did the same thing where I worked, contracting with the government. It worked great. I didn't have access to a true web server but was able to use the PWS on XP Pro from my own workstation. When we did a disaster recovery exercise it was a scramble to get my desktop over to the new place so people could connect to the printers. It made it easier for the users.

georgeou

All the users were told to go to printers.mycompany.com and they figured out how to attach their own printers in 1 minute. All they had to do was click on their location, find the printer nearest them, click on the install script, and they were good to go. We never had to label IPs on the printers; we just gave each one a friendly name like HP-12, and the user just needs to find HP-12 on the printer help page and click on it. This solution scales well globally and it's shockingly simple for the user. I think I'm going to have to create a download template for this help page.
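For readers who want to try the approach George describes before his template is available, here is a rough sketch of one way to wire it up: a small script that writes a static help page listing friendly printer names by location, each linking to a one-line install command. The server, location, and queue names are made-up placeholders, and this illustrates the general idea only, not George's actual page.

"""Rough sketch of a static "printer help page" generator: one page listing
friendly printer names by location, each linking to a tiny install script.
The server, location, and queue names below are placeholders."""
from pathlib import Path

PRINT_SERVER = r"\\printsrv01"          # hypothetical print server
LOCATIONS = {                           # location -> friendly queue names
    "Building A - Floor 1": ["HP-11", "HP-12"],
    "Building A - Floor 2": ["HP-21"],
    "Alternate work site": ["HP-90", "HP-91"],
}

out = Path("printer_page")
out.mkdir(exist_ok=True)

parts = ["<html><body><h1>Find the printer nearest you and click Install</h1>"]
for location, queues in LOCATIONS.items():
    parts.append(f"<h2>{location}</h2><ul>")
    for q in queues:
        # One tiny .cmd per printer; running it maps the shared queue.
        (out / f"install_{q}.cmd").write_text(
            f"rundll32 printui.dll,PrintUIEntry /in /n {PRINT_SERVER}\\{q}\r\n"
        )
        parts.append(f'<li><a href="install_{q}.cmd">Install {q}</a></li>')
    parts.append("</ul>")
parts.append("</body></html>")

(out / "index.html").write_text("\n".join(parts))
print(f"Wrote {out / 'index.html'} plus one install script per printer")

Copy the generated folder to any box that can serve static files (or even a file share) and users only ever see friendly names like HP-12, never IP addresses.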

NOW LEFT TR

None of the failures were caused by a helpdesk. A service desk perhaps, but not a helpdesk. For one so keen to learn lessons, get the termanology correct. A helpdesk would not have been to blame. Check your ITIL.

cris.e

We found a good time for this was during the end of year freeze. It sounds like a bad time, but it is really pretty slick. I think we went Dec 18-20 or so and it left time for everyone to still enjoy the holidays. We were setting everything up on a parallel infrastructure so prod systems weren't affected, and we didn't have to worry about accommodating any changes since the whole enterprise was locked down for the freeze. (We happen to have our disaster recovery site across town, so there was no travel involved for folks. We do our mainframe recovery far, far from here so that gets tested in summer.)

pc21geek

George, I would be interested in that template/script. Please do post it!! Regards, Kevin

richardw

Let's get off the ITIL trip; whether it's a helpdesk or service desk, who cares... It's "terminology", btw.
