I am sure that most conscientious developers do their best to make sure that the next coder who tries to maintain the program has an easy time. But how many of us pay any attention to the system engineers who need to actually deploy these things? The answer is, quite unfortunately, not enough of us.
UNIX is a great example of what I am talking about. Up until very, very recently, UNIX was known to be a system that only a programmer could like. Being a UNIX systems administrator involved knowledge of things like makefiles, building kernels, applying code patches, and so on. As a result, UNIX got a bad reputation that continues long after the situation has been changed.
Likewise, many large “enterprise class applications” are so complex to install and configure that companies will spend as much (or even more!) on outside experts to install them as they did on the original software itself. Now I do not know about you, but I think that if installing Microsoft Office cost $500 or installing Photoshop cost $600 in outside help, these applications would not be used too much. Even when you go to a PC shop, they rarely charge for more than an hour’s worth of time to install Windows. The current state of desktop application installation (including *Nix and Mac) shows that installation does not need to be this tough. Most desktop apps, regardless of platform, use that platform’s standard installation system. Once the user learns that system, that is it.
And then we get to the custom coded solutions that most developers are working on. What a mess! At best, there might be some .aspx or .jsp or .php (or whatever) files to copy over. At the medium end, there might be some compiled binaries or bytecode to move and possibly an application server restart. And then we get the really hairy ones, the kind with custom DB scripts that may or may not overwrite existing configuration data, the ones that require a full reboot, the ones that involve stopping and then starting the database server, and so on. You know the kind that I mean: downtime, possible failure, and lengthy recovery times.
On the desktop side, anything involving Oracle is a guaranteed mess when it needs the Oracle client. Java desktop apps are another pain point, particularly when they need a non-standard JVM. There is nothing that will aggravate desktop support quite like juggling four apps that need three different JVMs. And, of course, the users love installers or patches that “require” a reboot that really is not needed.
Why are we still stuck in this situation? Sure, some of the systems have improved, like application servers that maintain session state throughout a restart or are nice enough to reload bytecode if the file changes. On the other hand, redeploying a Java EAR file or running an MSI installer to update the application is a lot more time-consuming and occasionally painful than running “cp -R /path/to/staging/* /path/to/production”.
When the deployment fails, our only real option is to stop the application server, stop the database server, restore the database, roll back the files, and fire everything up again. This can take hours, during which customers are furious. At least in a bigger system, there is some load balancing or clustering going on, so the system is not hard down. But still, no one likes the situation at all.
I am really not sure what the solution is. I know one major item is to fully and completely separate configuration from runtime data. That takes away a lot of the potential damage and difficulty of both the deployment and the rollback (even in a clustered scenario, if you deployed bad metadata or configuration to the central storage area, all nodes are bad). Another item is to make sure that the deployment is as easy as possible for the person doing it, throws off clear, concise error messages, and does not commit anything unless all operations are 100% complete. After all, it might not be the rock star sysadmin or coder doing the move, but a summer intern who has an escalation path in case he notices an error. Finally, make sure that as many scripts as possible pull from the system’s environment and not your test box assumptions. Remember, your backend Ubuntu machine may very well have a different directory hierarchy from the Red Hat or Solaris machine that the system guys are deploying it to.
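To make a couple of those items concrete, here is a minimal sketch in Python of a deployment that commits nothing until every step has succeeded and that takes its paths from the environment instead of from my test box. It is one possible approach, not a prescription. The variable names APP_STAGING_DIR and APP_DEPLOY_ROOT, the releases/current directory layout, and the DeployError class are all my own inventions for illustration, and the symlink switch assumes a POSIX file system.

```python
import os
import shutil
from pathlib import Path


class DeployError(Exception):
    """Raised with a short, actionable message when a deployment is abandoned."""


def paths_from_environment(env=None):
    """Read deployment paths from the environment (hypothetical variable names)."""
    env = os.environ if env is None else env
    required = ("APP_STAGING_DIR", "APP_DEPLOY_ROOT")
    missing = [name for name in required if not env.get(name)]
    if missing:
        # Refuse to guess: falling back to a hard-coded path is exactly the
        # "test box assumption" we are trying to avoid.
        raise DeployError("set these environment variables first: " + ", ".join(missing))
    root = Path(env["APP_DEPLOY_ROOT"])
    return Path(env["APP_STAGING_DIR"]), root / "releases", root / "current"


def deploy(staging, releases, current_link, release_name):
    """Copy staging into a new release directory, then switch current_link to it.

    The running application only ever looks at current_link, so nothing it can
    see changes until the final rename. A failure before that point leaves the
    previous release in place.
    """
    if not staging.is_dir():
        raise DeployError(f"staging directory not found: {staging}")
    target = releases / release_name
    if target.exists():
        raise DeployError(f"release already exists: {target}")

    tmp_link = current_link.with_name(current_link.name + ".tmp")
    try:
        releases.mkdir(parents=True, exist_ok=True)
        shutil.copytree(staging, target)
        # Build the new symlink under a temporary name, then rename it over
        # the old one. The rename is a single atomic step on POSIX systems.
        tmp_link.unlink(missing_ok=True)
        tmp_link.symlink_to(target, target_is_directory=True)
        os.replace(tmp_link, current_link)
    except OSError as exc:
        # Clean up the half-finished release; the old one was never touched.
        shutil.rmtree(target, ignore_errors=True)
        try:
            tmp_link.unlink(missing_ok=True)
        except OSError:
            pass
        raise DeployError(
            f"deployment of {release_name} abandoned, previous release untouched: {exc}"
        ) from exc
    return target
```

Calling it would look something like deploy(*paths_from_environment(), "2024-06-01") with whatever release naming you prefer. Because old releases stay on disk in this layout, rolling back the files becomes a matter of pointing the link at the previous release directory. It does nothing for the database side of the problem, which is the harder half.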
There is a lot more work to be done until we all have easy-to-deploy and patch systems, but until then, let’s do our best to keep the server folks happy.