
How developers can survive a 'go live' scenario

Something always goes wrong in a "go live" scenario. Prepare your development team, your work environment, and yourself to handle these high-pressure situations.

The difference between a simple deployment and an all-out "go live" is the project's sheer scope and scale. A deployment is run of the mill, while a go live is the kind of major change that takes months or years to put together and represents either the initial deployment for production use or a major upgrade. Here are some of the lessons I've learned over the years about getting through a go live in one piece.

Create a game plan

You need to have a well-defined plan in advance. Here is a checklist of things you need to know before you can go live:

  • What are the exact steps to perform any needed data migrations or server changes?
  • How do you perform the deployment? (Make a checklist and give every team member a printed copy. Cross items off as you complete them, noting who did each one and the date and time it was done.)
  • How do you back up the application and data pre-deployment?
  • If something goes catastrophically wrong, what is the procedure to restore the original version or restart the deployment?
  • How do you reach the IT or support/help departments of all involved parties?
  • When do you reach your go/no-go point (i.e., the point when you either commit to finishing the deployment or start to roll back the changes), and what are your criteria?
  • What are your post-deployment triage priorities? (This one is especially important, because there will likely be a lot of "squeaky wheels" begging for grease, and you need to know which ones to prioritize first.)
  • How do you take things offline and bring them back online if needed?
  • How do you contact your customers to let them know things are okay or not okay?

You'll notice a lot of these items are not about "how do we do this?" but "what do we do if something goes wrong?" There's a reason for that: things can and will go wrong, and the quality of a go live depends just as much on how you handle the things that go wrong as on what goes right.
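To make the backup and rollback items on that checklist concrete, here is a minimal sketch of a pre-deployment snapshot script. The article doesn't name a stack, so the paths, the tar-based file snapshot, and the PostgreSQL pg_dump/pg_restore calls are all illustrative assumptions; the point is that both directions, backup and restore, are scripted and rehearsed before go-live night rather than improvised during it.

```python
#!/usr/bin/env python3
"""Pre-deployment backup and one-command rollback (illustrative sketch)."""
import datetime
import pathlib
import subprocess

# Hypothetical locations and connection string; the article does not
# specify a stack, so adapt these to your own environment.
APP_DIR = pathlib.Path("/srv/myapp")
BACKUP_DIR = pathlib.Path("/backups")
DB_DSN = "postgresql://admin@db.example.com/appdb"


def backup() -> str:
    """Snapshot the application files and database; return the snapshot ID."""
    stamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
    app_tar = BACKUP_DIR / f"app_{stamp}.tar.gz"
    db_dump = BACKUP_DIR / f"db_{stamp}.dump"

    # 1. Archive the deployed application directory.
    subprocess.run(
        ["tar", "-czf", str(app_tar), "-C", str(APP_DIR.parent), APP_DIR.name],
        check=True,
    )
    # 2. Dump the database in pg_dump's custom format so pg_restore can use it.
    subprocess.run(
        ["pg_dump", "--format=custom", f"--file={db_dump}", DB_DSN],
        check=True,
    )
    return stamp


def rollback(stamp: str) -> None:
    """Restore the pre-deployment snapshot identified by `stamp`."""
    # Restore the application files over the deploy directory.
    subprocess.run(
        ["tar", "-xzf", str(BACKUP_DIR / f"app_{stamp}.tar.gz"),
         "-C", str(APP_DIR.parent)],
        check=True,
    )
    # Drop the changed objects and restore the pre-deployment database.
    subprocess.run(
        ["pg_restore", "--clean", "--if-exists", f"--dbname={DB_DSN}",
         str(BACKUP_DIR / f"db_{stamp}.dump")],
        check=True,
    )


if __name__ == "__main__":
    stamp = backup()
    print(f"Snapshot {stamp} taken; roll back with rollback('{stamp}')")
```

Whatever your stack looks like, the same design holds: the rollback path should be a single, tested command, because the moment you need it is the worst possible moment to be writing it.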

Prepare physically and emotionally

Heading into the last stretch before a go live, everyone needs plenty of sleep and rest, because when things go wrong, you'll be working long hours to sort them out. A team of folks already at their breaking point will not work well together, will have a hard time executing the basics, and will fall flat when challenges arise.

For employees with spouses, kids, etc., you need to get them on board too. There is no way I could have accomplished what I did on any of my go live adventures if my wife had been calling me constantly, begging me to come home or laying on a thick guilt trip about not seeing the kids.

Prepare your home office

During a go live, there is a good chance you will have to work long hours at a stretch or at odd hours of the night. Many times, the work cannot wait for you to get into the office.

Unlike me, most developers work only a few hours a week from home, if at all, so they are not well equipped to do it. Make sure your home office setup is ready for real work, which means a proper keyboard, mouse, and monitors, as well as a squared-away Internet connection. I also make sure that I have enough music and a good speaker or headphone setup so I can drown out the household noises.

Be flexible with your team

Your team will likely work long, hard hours during the go live. It doesn't help anyone if you are a hard case about things like getting to the office at 9:00 AM, shaving, dressing up, etc. unless those things are absolutely necessary (such as on customer meeting days). For the go live I just went through, I often worked until 4:00 AM, 5:00 AM, and even 6:00 AM a couple of times. If I had forced myself to go into the office at 9:00 AM, I would have been too tired to be effective for the entire day. Instead, I let myself get the extra sleep, and then only went into the office if it made sense.

Also, if folks are stuck at the office late, buy them dinner.

Hope for the best, prepare for the worst

NASA builds triple redundancy into projects because things will go wrong, no matter how much you plan. Unless you are working for a big enterprise, you probably can't do that. You have a Development environment, a Testing/QA/Staging environment or two, and a Production environment. The best thing we did for our latest go live was import the data from Production into our Testing environment about a week in advance, which let us catch a lot of issues early and test our data migration against the most realistic data set we could get.
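For what it's worth, here is a rough sketch of that Production-to-Testing refresh. The article doesn't name a database, so the PostgreSQL tooling (pg_dump, pg_restore, psql), connection strings, and table names below are assumptions for illustration. The row-count spot check at the end is the kind of quick sanity test that tells you the copy actually landed before you start exercising the migration against it.

```python
#!/usr/bin/env python3
"""Refresh the Testing environment with a copy of Production data (sketch)."""
import subprocess

# Hypothetical connection strings and tables; substitute your own.
PROD_DSN = "postgresql://readonly@prod-db.example.com/appdb"
TEST_DSN = "postgresql://admin@test-db.example.com/appdb"
SPOT_CHECK_TABLES = ["customers", "orders", "invoices"]


def refresh_testing_from_production() -> None:
    """Pipe a Production dump straight into a restore over Testing."""
    dump = subprocess.Popen(
        ["pg_dump", "--format=custom", PROD_DSN],
        stdout=subprocess.PIPE,
    )
    # --clean/--if-exists drop Testing's existing objects so the
    # restored schema matches Production exactly.
    subprocess.run(
        ["pg_restore", "--clean", "--if-exists", f"--dbname={TEST_DSN}"],
        stdin=dump.stdout,
        check=True,
    )
    dump.stdout.close()
    dump.wait()


def spot_check_row_counts() -> None:
    """Compare row counts between environments to confirm the copy landed."""
    for table in SPOT_CHECK_TABLES:
        counts = []
        for dsn in (PROD_DSN, TEST_DSN):
            out = subprocess.run(
                ["psql", dsn, "--tuples-only", "--no-align",
                 "--command", f"SELECT count(*) FROM {table};"],
                capture_output=True, text=True, check=True,
            )
            counts.append(out.stdout.strip())
        status = "OK" if counts[0] == counts[1] else "MISMATCH"
        print(f"{table}: prod={counts[0]} test={counts[1]} [{status}]")


if __name__ == "__main__":
    refresh_testing_from_production()
    spot_check_row_counts()
```

Doing this a week ahead, as described above, is what buys you the time to fix what the realistic data shakes loose.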

Remember you're all on the same side

As the pressure mounts during the go live, things often get tense. It is easy for folks to start snapping at and working against each other instead of with each other. Finger-pointing will not solve the issues.

You absolutely must not allow the pressure, lack of sleep, or other issues to interfere with how you work with your teammates. When you are through it all, you'll be glad you kept your cool.

J.Ja


About

Justin James is the Lead Architect for Conigent.

9 comments
Refurbished

I worked on several pre-Agile go-live implementations. Although we usually worked normal business hours, for go-lives we would set up shifts for 24/7 coverage for the first few days. It made things a lot easier.

sysop-dr

Make your plan, practice your plan, then after everyone has had 2 days off and one day back on to re-familiarize, jump in, both feet.

bjmoore

I'm continually surprised by the reluctance to have fallback / contingency plans and the howls of outrage incurred when I announce that various people have the power to declare a 'halt' to a deployment. Despite solid planning, things can and do go wrong at times. I'd much rather implement a fallback that we've had some time to discuss, wring out, and consider the implications of than try to come up with a plan on the fly and hope we have it right. The most laughable objection I hear is "if you've planned this correctly, you don't need a contingency plan." My view is if you haven't covered the contingencies, you haven't completed your planning. Half a plan is just wishful thinking.

Tony Hopkinson

While obsessing over all the costs and the risks of the known problems, don't forget the unknown ones. Four hours of sleep, four days in a row, because of one little omission: the order transactions were sent to another system. If, the first time you look through all the interactions of your new rollout, you can't find a problem, get someone else with a different perspective to look. It's there, right under your nose, probably.

tech_sud

Hi James, I thought that "go live" was outdated terminology, because Agile projects now go live from day one. Every day you add a feature, it is live, so you get instant feedback. Thanks, Ahmed.

Justin James

Lack of sleep becomes a *huge* risk. For my last go live, we spent a LOT of time looking over each other's shoulders because of it! In fact, other than a particular situation not suited for public discussion, most of our big issues were things that were done with only one person looking at it, in the tail end of the final push when sleep was at a premium. J.Ja

Justin James

Ahmed - My current project is pretty agile, we deploy features and fixes at *least* weekly. But that still didn't stop us from rolling out a massive new upgrade that took roughly 3 months to develop. You simply cannot build "big functionality" and deploy it weekly, especially when a data migration of millions of records is involved. You *might* pull it off if the data can be easily extended, but when it is very different, it just isn't going to work that way. Look at, say, Gmail. Yes, they develop that in an Agile fashion. But I guarantee that they put a lot of effort into it before it had an initial "go live"! No one just throws a data schema and a rough "list"/"edit"/"delete" application online and starts iterating from there. That's not "agile", that's "dumb". J.Ja

Tony Hopkinson

For instant feedback to be in any way reliable, that is. If the system being rolled out is a replacement, your new "feature" is doing something you got wrong, right, or doing something you forgot, and half a replacement is often worse than a complete failure. It all depends on the system, but imagine, for instance, you discover your transactions are mis-ordered when summing their impact in a periodic report. Instant feedback is instant on recognition of an issue, not on its creation.

Tony Hopkinson

24/7 manufacturing system. Patched and bodged to keep the system stuttering along. Second order of business (first was 36 hours of sleep) was to revisit everything I'd done, and do it properly. :(
