There comes a time for every developer when our brilliant creations have to be turned loose into the world to succeed marvelously (or fail miserably) on their own. Pushes, deployments, rollouts, updates—whatever you call them—they are moments of truth for developers of all stripes, representing the culmination of weeks or sometimes months of work and testing. To find out how the development staff of TechRepublic.com, another site in the CNET Networks family, handles these events, I sat down with Greg Gorman, their director of application development.
What I found out surprised me: “Deploys,” as Greg refers to code rollouts, occur weekly. This seemed a prescription for madness to me, given that monthly updates seemed to come too frequently for comfort in my previous life as a full-time developer. Read on to find out how Greg and his team manage deployments in Internet time.
Setting a schedule
Builder: So the Web site update schedule at TechRepublic is really once a week?
Greg: Yes, roughly every Thursday we deploy changes to the TechRepublic Web site. When we have large changes like a Dynamo upgrade or a new version release, we usually do that over a weekend, but those only occur every three months or so and sometimes longer.
Builder: Has it always been a weekly event?
Greg: No. Originally it was just ad hoc. Whenever we had something, we’d try and push it out. It’s probably been close to a year since we’ve gone to once a week.
Builder: Were there any challenges you faced when implementing the weekly rollout?
Greg: I guess just setting expectations. We still actually do deploys in between the weekly ones if there’s something broken or seriously wrong or [there's] a page that has a glaring problem with it. We’ll kind of fast track that through and get it there as soon as we can.
Builder: When you made the switch to a weekly schedule, did you notice any change in your error rates or bugs?
Greg: I would say they’ve gone down. There’s now less rework on the live site because the general sense, on the ad hoc schedule, was, “Well, it’s done now. It really needs to get out to live,” and sometimes I think there was a sense of urgency, and we said, “Don’t spend a lot of time testing. Let’s get this thing out.” Whereas now we have a more formal process, and I think that tends to give us the ability to stop and think, “Well, what else might this affect?”
Builder: When doing larger updates, like rolling out on a new version of Dynamo, how is the deploy process affected?
Greg: When doing a large effort, such as the Dynamo 5 conversion or a new version of the site, we typically put a deploy freeze in effect about four to six weeks beforehand to allow our Quality Assurance (QA) team to focus on testing the new stuff. Emergencies will still go out during that time, but as much as possible, we do what we can to keep the QA team focused on testing the big new stuff.
I really need that finished today…
Builder: I know that in every organization, there’s one group that always wants to bend the rules and get their changes rushed through, regardless of the schedule. Which group tends to push the hardest for rushed enhancements at TechRepublic?
Greg: I’d say the sales group, but that’s just part of the business. Right now the advertisers are holding the keys, and so we jump through whatever hoops we have to, within reason, without jeopardizing site availability and uptime to try and make them happy.
Builder: How far ahead do the developers work?
Greg: The official process—we don’t always hold to it—is that we get all of the development work completed by the Friday before Thursday’s deploy to give us time for reviews.
Builder: Can you walk me through the review process?
Greg: Sure. There are several sets of reviews that go on. There’s a code review that’s internal to the development team. If there are any changes relating to database stored procedures, data tables, or any of that stuff, then the data services group needs to review those and give their blessing. Then, the official deadline of having everything, including data services review, complete so that it’s ready to test is Monday at noon. That gives our QA team a chance to go through and QA it on the staging boxes.
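As an illustration of the timeline Greg describes (this is a sketch, not TechRepublic's actual tooling), the weekly milestones can be derived from the deploy date. The function name and dictionary keys below are invented for this example; the only facts taken from the interview are the Thursday deploy, the Friday development cutoff, and the Monday-noon review deadline.

```python
from datetime import date, datetime, time, timedelta

THURSDAY = 3  # date.weekday() counts Monday as 0, so Thursday is 3


def deploy_deadlines(deploy_day: date) -> dict:
    """Return the milestones leading up to one Thursday deploy.

    Hypothetical helper: the key names are assumptions made for this sketch.
    """
    if deploy_day.weekday() != THURSDAY:
        raise ValueError("this sketch assumes the weekly deploy falls on a Thursday")
    return {
        # Development work is due the Friday before the deploy (six days earlier).
        "dev_complete": deploy_day - timedelta(days=6),
        # All reviews, including data services, are due Monday at noon.
        "ready_to_test": datetime.combine(deploy_day - timedelta(days=3), time(12, 0)),
        "deploy": deploy_day,
    }


if __name__ == "__main__":
    for milestone, when in deploy_deadlines(date(2024, 1, 4)).items():
        print(milestone, when)
```

Working backward from the deploy date keeps the three deadlines consistent with one another if the deploy day ever moves.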
Builder: The staging boxes?
Greg: The sets of servers we deploy code onto before sending it to the live site. Devstaging is used as sort of an integration server and uses a test database. Source code gets deployed from SourceSafe to Devstaging and then checked out by QA. On the day of the “live deploy,” we move things to Qastaging, which uses the live database. Everything gets one more check on Qastaging, and then it gets deployed to the live site.
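The promotion path Greg outlines can be modeled in a few lines. This is a rough sketch under stated assumptions: the stage order and the database each stage uses come from the interview, while the Stage class, the "Live" label, and the promote function are hypothetical names chosen here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    database: str  # which database the stage talks to


# Order and database pairing follow the interview; "Live" is a label chosen here.
PIPELINE = (
    Stage("Devstaging", "test"),
    Stage("Qastaging", "live"),
    Stage("Live", "live"),
)


def promote(current: str, qa_check_passed: bool) -> Stage:
    """Return the stage that follows `current`, provided QA has checked it."""
    names = [stage.name for stage in PIPELINE]
    index = names.index(current)  # raises ValueError for an unknown stage name
    if index == len(PIPELINE) - 1:
        raise RuntimeError("code on the live site has nowhere further to go")
    if not qa_check_passed:
        raise RuntimeError(f"QA has not passed the build on {current}")
    return PIPELINE[index + 1]


if __name__ == "__main__":
    stage = promote("Devstaging", qa_check_passed=True)
    print(stage.name, "uses the", stage.database, "database")
```

The point of the two staging tiers shows up in the data: code first meets the live database on Qastaging, one step before it reaches the live site.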
Builder: How would you characterize your internal code reviews?
Greg: Code reviews are pretty informal here. We have a small checklist of things we look for, but it is probably one of our weaker steps in the software development life cycle. It is required that a code review be completed, or marked as not applicable, before QA will deploy to Devstaging.
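The gate Greg mentions could be expressed as a simple check. This is a minimal sketch, not the team's real process tooling: the ReviewStatus values and the ready_for_devstaging function are hypothetical, and treating the data services sign-off as part of the same gate is an assumption based on his earlier description of the reviews.

```python
from enum import Enum


class ReviewStatus(Enum):
    PENDING = "pending"
    COMPLETED = "completed"
    NOT_APPLICABLE = "not applicable"


def ready_for_devstaging(
    code_review: ReviewStatus,
    touches_database: bool,
    data_services_approved: bool,
) -> bool:
    """Decide whether QA may deploy a change to Devstaging."""
    # A code review must be completed or explicitly marked not applicable.
    code_ok = code_review in (ReviewStatus.COMPLETED, ReviewStatus.NOT_APPLICABLE)
    # Assumption: database changes also need the data services group's blessing.
    data_ok = data_services_approved or not touches_database
    return code_ok and data_ok


if __name__ == "__main__":
    print(ready_for_devstaging(ReviewStatus.NOT_APPLICABLE, False, False))
    print(ready_for_devstaging(ReviewStatus.COMPLETED, True, False))
```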
Creating a testing plan and dealing with layoffs
Builder: What steps are there in your QA process?
Greg: Well, we create a test plan for each major thing that’s being deployed. If it’s just copy change or something simple, then we don’t create a test plan. If we’re adding a new feature or enhancing something, then we look at the spec to determine what changes are being made and what should happen and then create a test plan. So if it’s, say, a new enrollment process, they’ll walk through it with some test data and then actually go out to the database and see if it inserted the rows that were supposed to be inserted in the table.
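The enrollment check Greg describes (run the process with test data, then confirm the rows landed in the table) might look like the sketch below when scripted. Everything specific here is an assumption made for illustration: the members table, its columns, the enroll function, and the use of an in-memory SQLite database in place of the site's real test database.

```python
import sqlite3


def enroll(conn: sqlite3.Connection, email: str, name: str) -> None:
    """Stand-in for the enrollment process under test (hypothetical)."""
    conn.execute("INSERT INTO members (email, name) VALUES (?, ?)", (email, name))
    conn.commit()


def run_enrollment_check() -> list:
    """Walk through enrollment with test data, then inspect the table."""
    conn = sqlite3.connect(":memory:")  # throwaway database for the test
    conn.execute("CREATE TABLE members (email TEXT PRIMARY KEY, name TEXT NOT NULL)")
    enroll(conn, "test.user@example.com", "Test User")
    # Go out to the database and see whether the expected row was inserted.
    rows = conn.execute(
        "SELECT email, name FROM members WHERE email = ?",
        ("test.user@example.com",),
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    assert run_enrollment_check() == [("test.user@example.com", "Test User")]
    print("enrollment inserted the expected row")
```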
Builder: Do the QA people draw up the test plan or do the developers?
Greg: It varies. If I had to just guess, I’d probably say 75 percent QA, 25 percent developer. I’m trying to get the developers more involved. QA has been more heavily involved [in creating the test plans] in the past, but there are some things that are going to push us to have developers more involved, with recent staffing cutbacks and such.
Builder: Since you mentioned it, how did the change in staffing levels affect the process?
Greg: Well, we have very restricted resources now in QA. And the Project Management staff has been sort of redeployed in a sense: They’re really not going to be writing specs at all. In the past, QA could look at the [Project Manager’s] specs and write their own full-blown test plan when one was needed. Now it’s going to be up to the developers to communicate with the business people that are making the request and write their own specs. But [those specs] probably won’t be as detailed, so they’re going to have to be involved more closely with QA to make sure that QA knows what should be happening and what they should be testing.
Builder: What about the recent acquisition by CNET Networks?
Greg: We’re in a transitional phase now. CNET works in a different way than TechRepublic did. It’s not necessarily better or worse, it’s just different. So we’re going to have to respond to that. I think it will push more responsibility to the development team.
Builder: How would the QA steps change if your release schedule changed?
Greg: Actually, we’re looking at trying to make it a little bit longer, maybe every two weeks [to give us more time]. Again, with limited QA resources, it’s not just the testing that takes time, it’s the actual physical deployment of files, the coordination with the operations team and with data services to make sure that the right things get deployed at the right time for the live site. Whether you’re deploying one quick fix or a whole bunch of stuff, there are a certain number of steps that you’re going to have to run through. So if we can group those together and do it every two weeks, then we think that we’ll be able to gain a little economy and, hopefully, free up our QA team to script some of their tests.
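Greg's point about fixed per-deploy overhead can be made concrete with some back-of-the-envelope arithmetic. The hour figures below are invented purely for illustration and do not come from TechRepublic; only the weekly versus every-two-weeks comparison is taken from the interview.

```python
def deploy_hours(
    num_deploys: int,
    fixed_hours_per_deploy: int,
    num_changes: int,
    hours_per_change: int,
) -> int:
    """Total effort: a fixed cost per deploy plus a variable cost per change."""
    return num_deploys * fixed_hours_per_deploy + num_changes * hours_per_change


if __name__ == "__main__":
    # Hypothetical four-week period with 20 changes to ship either way.
    weekly = deploy_hours(4, fixed_hours_per_deploy=6, num_changes=20, hours_per_change=2)
    biweekly = deploy_hours(2, fixed_hours_per_deploy=6, num_changes=20, hours_per_change=2)
    print("weekly:", weekly, "biweekly:", biweekly, "saved:", weekly - biweekly)
```

Under these made-up numbers the per-change work is identical in both schedules, so the entire saving comes from running the fixed deployment steps half as often.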
Builder: You mean do more automated testing than you do now?
Greg: Yes. Most of the functional-type testing we’ve done [up until now] has been more manual. For load testing, we have a product from Mercury Interactive called LoadRunner (not to be confused with Lode Runner, which was a video game). We’ve used the LoadRunner tests quite a bit, and it’s a very good tool for load testing. We have another tool from Mercury Interactive called WinRunner that allows you to create scripts that will step through your code and do some testing.