
Improving the code writing process

Writing code is a goal oriented process. Unfortunately, the tools that developers have do not assist them in attaining their goals. The tools are getting better (as someone who has had to write COBOL in vi, I can attest to that), but they still do not understand just how programmers operate. The development tools themselves are still how oriented, not why oriented. Let us take a look at how the code writing process hinders rather than helps developers.

The documentation is predicated on the user knowing what they are looking for.

This is improving, but only because the IDEs have glued ToolTips, AutoComplete, etc. into the editors. Coding now is a process of naming your variable, pressing the period, and then scrolling through the list of methods and properties to find what sounds like it does what you want.

But try starting off from a state in which you do not know what objects you need. In other words, try something you have never done before. You are in deep trouble. Language and API documentation is still dominated by how and not why. It assumes you know what object class (or variable type, or function, or whatever) you need, and then shows you what your options are. This is required information, but it is not very helpful, especially if you are not familiar with that language's terminology (or that library's terminology). It is so easy to not find what you are looking for, if the language has standardized .ToString() for everything, but what you are working with has .ToText() instead. More to the point, there needs to be more documentation like the Perl FAQs: goal oriented documentation.

The Perl FAQs are perfect. There are hundreds of "I am trying to do XYZ" items in there, and code that shows you exactly how to do it. The documentation asks the user, "what is your why and how can I help you accomplish that?" I use the Perl FAQs more than the actual reference most of the time; I already know the language syntax, but there are a lot of whys that I have not tried to do. Indeed, the Perl documentation contains so much usable code in a goal oriented layout that it is possible to write 75% of a program out of it. Just try that with Whatever Language In a Nutshell. I have only seen one programming book laid out in a "let us accomplish something" format as opposed to a "here is how we work with strings, here is how we work with numbers" format.

The tools are too focused on writing code.

I know that this is counter-intuitive. IDEs are all about code, right? Well, not really. Writing code is the how. The true why is "creating great software." Writing code is simply a means to that end. The reality is that too many pieces of software simply stink, not because the internal logic is no good, but because the programmer left out things like error handling, input validation, etc. out of sheer laziness or ignorance. An IDE that lets you attempt an implicit conversion in a strongly typed language when you have strict turned on is doing you no favors, especially if that block of code is somewhere that only gets accessed once in a blue moon. A language or IDE that makes input validation "too much hassle to bother with" is not doing anyone any favors.

Here is a great example: too many web applications rely upon a combination of JavaScript and the maximum length specification in a form object to do their validation. Unfortunately, not everyone has JavaScript turned on, and many people use some type of auto complete software to fill out a form. And someone can always link to your application backend without replicating your interface. So no matter how much input validation you do on the client side (not saying you should skip it; users typically prefer getting the error before the form is actually submitted to a server), you still need to do it on the backend. Sadly, the concept of tying the input validation logic on the server side to the input validation on the client side is still pretty rare (ASP.Net with its Validator controls is good, but not great). So you end up with code that is either a hassle for the end user (no JavaScript validation) or vulnerable to all sorts of nasty things (no server side validation), or you are forced to write all of your validation code twice, in two different languages.
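To make the "write it once" idea concrete, here is a minimal sketch in Python of a single rule definition that drives both sides. It is an illustration under stated assumptions, not how ASP.Net or any particular framework does it: the `FieldRule` class and the `render_input` and `validate` functions are hypothetical names invented for this example. The HTML attributes it emits (`maxlength`, `required`, `pattern`) are standard and give the browser its convenience checks, while the server re-checks the very same rule.

```python
from dataclasses import dataclass
from html import escape
from typing import List, Optional


@dataclass(frozen=True)
class FieldRule:
    """Hypothetical: one declarative rule, the single source of truth for a field."""
    name: str
    required: bool = True
    max_length: int = 50
    digits_only: bool = False


def render_input(rule: FieldRule) -> str:
    """Emit an HTML input whose client side checks mirror the rule."""
    attrs = [f'name="{escape(rule.name)}"', f'maxlength="{rule.max_length}"']
    if rule.required:
        attrs.append("required")
    if rule.digits_only:
        attrs.append('pattern="[0-9]+"')
    return "<input " + " ".join(attrs) + ">"


def validate(rule: FieldRule, raw: Optional[str]) -> List[str]:
    """Server side check of the same rule; it never trusts the client."""
    errors = []
    value = raw or ""
    if rule.required and not value:
        errors.append(f"{rule.name} is required")
    if len(value) > rule.max_length:
        errors.append(f"{rule.name} is longer than {rule.max_length} characters")
    if rule.digits_only and value and not (value.isascii() and value.isdigit()):
        errors.append(f"{rule.name} must contain only digits")
    return errors


zip_rule = FieldRule("zip", max_length=5, digits_only=True)
form_html = render_input(zip_rule)          # goes to the browser
problems = validate(zip_rule, "12a45")      # runs again on the backend
```

The point of the design is that changing `max_length` in one place changes both the form and the backend check, so the two can never drift apart.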

This is all a byproduct of the sheer amount of effort that is needed to write this code. It is not brain work, it is drudge work. A well written program with a large amount of user interaction but little complex logic behind it, in a language with large libraries, can be 25% input validation. Let's be real, most applications are of the form "get data from a data source, display it to the user, allow the user some C/R/U/D functionality, and confirm to the user that the procedure was a success or failure." That is all most programs are. A significant portion of security breaches are caused by failure to validate input. For example, Perl has a known buffer overrun problem with using sprintf. "Everyone knows" that you need to validate user input before passing it to sprintf, to ensure that it will not cause a problem. And either through laziness or ignorance (note how I put "everyone knows" in quotation marks), this does not happen, so you get a web app that can execute arbitrary code. The WME exploit, zlib problems, et al. all boil down to a failure to validate input.

Imagine if the IDE (or the language itself), instead of being aimed at providing you with fancy indentation and color coding and whatnot, actually did this on its own. Perl does this to an extent with variable tainting; it will not let you pass a variable that came from the user to certain functions until you touch it first with other functions. More languages need a mechanism like this. But it is not enough. The idea that user input is always clean needs to be erased from the language and the tools, and replaced with a system that encourages good coding practice through compiler warnings and, better yet, by handling it for you. Imagine if your language saw you taking the contents of a text input and converting it to an integer, and had the good sense to automatically check it at the moment of input to ensure that it would convert cleanly. That would be a lot better than what we have now: trying the conversion, catching an exception, and throwing an error back. This lets the programmer focus on the why, in this case, getting numeric input from the user.
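As a rough illustration of what "checked at the moment of input" could look like, here is a small Python sketch loosely inspired by the tainting idea. It is not Perl's taint mechanism and not a feature of any existing language: the `Tainted` and `InputError` names are hypothetical. Raw input is wrapped so that it cannot be used as a plain string by accident, and the only way to get a number out is a conversion that validates first and hands back an error value instead of raising.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class InputError:
    """Hypothetical error value handed back instead of raising an exception."""
    message: str


class Tainted:
    """Hypothetical wrapper: raw user input that must be validated before use."""

    def __init__(self, raw: str) -> None:
        self._raw = raw

    def __str__(self) -> str:
        # Refuse to behave like an ordinary string until validated.
        raise TypeError("tainted input must be validated before use")

    def to_int(self, minimum: int, maximum: int) -> Union[int, InputError]:
        """Check the text first, then convert; never raises on bad input."""
        text = self._raw.strip()
        digits = text[1:] if text.startswith(("+", "-")) else text
        # Reject anything that is not a short run of ASCII digits.
        if not (digits.isascii() and digits.isdigit()) or len(digits) > 20:
            return InputError(f"{text!r} is not a whole number")
        value = int(text)
        if not minimum <= value <= maximum:
            return InputError(f"{value} is outside {minimum}..{maximum}")
        return value


age = Tainted(" 42 ").to_int(0, 150)    # a clean int
bad = Tainted("4x2").to_int(0, 150)     # an InputError, no exception to catch
```

The caller states the why ("I want a number between 0 and 150") and the drudge work of checking happens in one place.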

Program logic is a tree, but source code is linear

This is a problem that I did not even see until very recently. Very few programs are written procedurally at this point. The event driven programming model has taken over, and for good reason. Unfortunately, our entire source code creation process is derived from the procedural days. Look at some source code. What you see is that you have a bunch of units of code, all equal to each other. Even when working with object oriented languages, the tools themselves treat the code writing process as a linear, procedural system. You write an object; within that object, all methods are equal within the code. Navigating the code is tricky at best.

Even with an IDE that collapses regions, functions, properties, etc., when the code is expanded, it is still a plain text file. The way we have to write overloads is ridiculous. The whole process itself is still stuck in the procedural world, but we are writing for event driven logic. The tools simply do not understand the idea that some blocks of code are inherently derivatives or reliant upon other blocks of code. Too much code serves the purpose of meta data to the rest of the code (such as comments, error handling, function parameters, and more). It does not have to be like this, but it will require a major shift in thinking, both by the people who create the tools, and the people who use them.
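The relationships between blocks are recoverable from the flat text; the tools just rarely put them front and center. As a small sketch of that, the following Python uses the standard `ast` module to turn a linear source file into a map of which top-level function relies on which. The `SOURCE` snippet and the `call_tree` function are made up for this example, and the analysis is deliberately naive (it only sees direct calls by name).

```python
import ast
from typing import Dict, List

# Hypothetical flat source file: three functions that look "equal" on the page.
SOURCE = '''
def load(path):
    return open(path).read()

def parse(text):
    return text.split(",")

def main(path):
    return parse(load(path))
'''


def call_tree(source: str) -> Dict[str, List[str]]:
    """Map each top-level function to the sibling functions it calls by name."""
    module = ast.parse(source)
    functions = {
        node.name: node
        for node in module.body
        if isinstance(node, ast.FunctionDef)
    }
    tree = {}
    for name, node in functions.items():
        called = {
            inner.func.id
            for inner in ast.walk(node)
            if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name)
        }
        # Keep only calls to functions defined in this same file.
        tree[name] = sorted(called & set(functions))
    return tree


dependencies = call_tree(SOURCE)
```

In the text file, `main` is just the third of three equal blocks; in the resulting map it is the root that depends on the other two.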

Code writing is too separate from the rest of the process

Right now, the tools for completing a software project are loosely integrated at best. Even with the major tool suites, the tools within the suite are not all best of breed, and the better products just do not integrate well into the suite. For example, it would be pretty painful to write a VB.Net Windows desktop application in anything but Visual Studio. Even a simple ASP.Net application would be a hassle to work with outside of Visual Studio. Sadly, Visual Studio's graphics tools are crude at best. Its database tools are not so hot either, especially for database servers that do not come from Microsoft. Adobe/Macromedia makes excellent graphics editors. But the tools that the person making the graphics is using (Photoshop, Illustrator, Freehand, Flash, and so on) have zero awareness of Visual Studio, and vice versa. The graphics person has to do his work and then pass it to the programmer and the GUI person so they can see how it fits.

Microsoft is trying to address this problem with the upcoming Expression system, but I am not holding my breath. I will believe it when I see it. This creates a problem where the graphics artist does not realize that their vision cannot be implemented within the code. The systems architects have a hard time seeing that their detailed database layout is nearly impossible to turn into a usable interface. The project manager does not get an idea of just what is needed to make the workflow go smoother. And so on and so on.

It is great that the tool makers have brought testing and version control into the process. This helps tremendously. But these tools still are not perfect, and could use a lot of improvement, particularly version control. At this point, version control is still a glorified document check in/checkout system with a hook into a difference engine. It has no awareness of the program itself and it is still very difficult for multiple people to simultaneously work on the same section of code. Even then, as one person makes changes that affect others, the version control system is not doing much to help the team out. I worked at a place that used CVS; the system was so complicated that we barely used it. For what little it did, it was not worth the effort. Version control, even in a single developer environment, is a major pain point. I have some ideas on how to improve this, but this is not the time to discuss them.
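To show the gap between a difference engine and awareness of the program, here is a hedged Python sketch. It compares two versions of a function twice: once line by line with the standard `difflib` module, the way a text-based version control system would, and once by parsing both with `ast` and comparing the resulting structure. The `OLD` and `NEW` snippets and both helper functions are invented for this example; no real version control system is being described.

```python
import ast
import difflib

# Two hypothetical revisions: NEW only adds a comment and rewraps the expression.
OLD = "def area(w, h):\n    return w * h\n"
NEW = "def area(w, h):\n    # width times height\n    return (w *\n            h)\n"


def text_changes(old: str, new: str) -> int:
    """Count the lines a line-based diff reports as added or removed."""
    diff = difflib.unified_diff(old.splitlines(), new.splitlines(), lineterm="")
    return sum(
        1
        for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    )


def same_program(old: str, new: str) -> bool:
    """Compare parsed structure, ignoring comments, layout, and parentheses."""
    return ast.dump(ast.parse(old)) == ast.dump(ast.parse(new))


changed_lines = text_changes(OLD, NEW)   # the text diff sees several changed lines
unchanged = same_program(OLD, NEW)       # the parsed program is identical
```

A line-based tool flags this revision as a change a teammate has to review or merge, even though nothing about the program's behavior moved.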

The situation is not as bleak as I paint it

I know, I make it look like it is a wonder that programs get written at all. It is not quite so bad as that. But I think that it is time that the tools that we use to create software evolve to meet the why of the code writing process, as opposed to making the how easier. There are a lot of great things about the current tools, and I would not go back to vi and command line debuggers for any amount of money. But I also think that the tools that we have need to make a clean break from their past, and help us out in ways that they simply are not doing at this point in time.

J.Ja

About Justin James

Justin James is the Lead Architect for Conigent.
