
Why consultants should banish magical thinking

Chip Camden makes the case that magical thinking, or at least quasi-magical thinking, doesn't benefit consultants or their clients in the long run.

In a post titled Bad Programmers, originally published on the now-defunct Infogami site, the first sign that you might be a bad programmer is an "Inability to reason about code." The first two symptoms listed for that deficiency are:

  1. The presence of "voodoo code", or code that has no effect on the goal of the program but is diligently maintained anyway (such as initializing variables that are never used, calling functions that are irrelevant to the goal, producing output that is not used, etc.)
  2. Executing idempotent functions multiple times (e.g., calling the save() function multiple times "just to be sure"); both symptoms are sketched just below this list
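
Here is a minimal Python sketch of both symptoms (the names FakeDB, save_order, and build_audit_entry are invented for illustration; nothing like this appears in the cited post). The unused variable and the discarded audit entry are "voodoo code," and the repeated db.save() is the idempotent call made "just to be sure":

    class FakeDB:
        """Minimal stand-in database so the example runs on its own."""
        def __init__(self):
            self.saved = []

        def save(self, record):
            # Idempotent: saving the same record twice changes nothing.
            if record not in self.saved:
                self.saved.append(record)

    def build_audit_entry(order):
        return {"event": "save", "order": order}

    def save_order(order, db):
        retries = 0                       # voodoo: initialized, never used
        audit = build_audit_entry(order)  # voodoo: output produced, never used
        db.save(order)
        db.save(order)                    # idempotent call repeated "just to be sure"
        return True

    db = FakeDB()
    save_order("order-42", db)
    print(db.saved)  # ['order-42']; the extra call and unused values add nothing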

These two alternatives to logic point to a much broader deficiency in problem-solving: magical thinking, or at least quasi-magical thinking. Some individuals may not hold a rational belief in the efficacy of these actions, but they perform them anyway. They show an extreme reluctance to change anything, as if the computing gods would visit vengeance upon them for daring to explore the inner sanctum of the system. Yet that unwillingness to refactor at all leads to some of the worst bugs in the long run.

My oldest son (who recently graduated from Stanford and is now working for a Silicon Valley startup — can you tell I'm proud?) and I had a conversation the other day about this phenomenon. He observed that whenever he's encountered this attitude, it's accompanied by a sense of fear. "What are they afraid of?" he said, "Finding out the truth? If they don't figure out what's going on now, they'll just end up having to figure it out the night before release." That's my boy.

I think part of the fear comes from not wanting to get mired in details. They know from experience that fully proving what the code is doing could run into hours of debugging time. So instead, they invent a plausible myth as the culprit: "Oh, it must be the third-party component doing something weird," "it must be a timing thing," or my favorite, "it just doesn't like that customer." These explanations may be partially correct, but until you prove the assertion experimentally, it remains no more than a myth.

With third-party components you may have limited ability to experiment, especially if they're closed-source, but you can still verify inputs and outputs. If the component shares your address space (like a library or a control), then the kinds of bugs it can create may indeed seem like some form of dark magic. In that case, you can try to rule out the component by stubbing it.
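
One way to make that concrete is to swap in a stub that honors the component's interface, returns canned results, and records every call, then see whether the failure persists. Here's a minimal Python sketch using invented names (the payment gateway and its charge() method are hypothetical stand-ins, not any real library):

    class StubPaymentGateway:
        """Hypothetical stand-in for a closed-source third-party component."""
        def __init__(self):
            self.calls = []   # record every input for later verification

        def charge(self, account, amount):
            self.calls.append(("charge", account, amount))
            return {"status": "ok", "account": account, "amount": amount}

    def settle_invoice(gateway, account, amount):
        # The application code under suspicion; it only sees the interface.
        result = gateway.charge(account, amount)
        return result["status"] == "ok"

    stub = StubPaymentGateway()
    assert settle_invoice(stub, "acct-7", 19.95)
    print(stub.calls)  # verify exactly what the component was asked to do

If the bug still appears with the stub in place, the third-party component is ruled out; if it disappears, the recorded inputs and outputs show where to look next.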

Timing issues can be some of the most intractable bugs. Even so, 90% of the time that someone suggests timing as the source of the problem, they're wrong. It's an easy scapegoat, because it's so hard to prove or disprove. So instead of just making the claim, try to construct a test that reproduces the same problem at different speeds.
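
As a sketch of what such a test can look like (a generic Python illustration, not drawn from any particular engagement), the snippet below performs a deliberately unsynchronized read-modify-write and uses an injected delay to widen or narrow the race window:

    import threading
    import time

    def increment(state, delay):
        # Unsynchronized read-modify-write; the delay widens the race window.
        value = state["count"]
        time.sleep(delay)      # simulated work between the read and the write
        state["count"] = value + 1

    def run_trial(delay, workers=8):
        state = {"count": 0}
        threads = [threading.Thread(target=increment, args=(state, delay))
                   for _ in range(workers)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return state["count"]  # with no race, this equals workers

    for delay in (0.0, 0.001, 0.01):
        counts = [run_trial(delay) for _ in range(5)]
        print(f"delay={delay}: counts={counts} (expected 8)")

If the number of lost updates rises and falls with the delay, the timing hypothesis has been demonstrated experimentally rather than merely asserted.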

Attributing personality flaws to software or hardware is unlikely to provide an accurate model of what is happening. Nevertheless, it does sometimes seem that certain people evoke a particular failure. You get something fixed, and as you send it to them you know in the depths of your heart that, for them and only for them, it will not solve the problem. However, a rational explanation always surfaces in the end. A former DEC field engineer once told me about a case like this. One of the old PDP-8 systems at a customer site crashed on a daily basis. Every time the engineer went to check it out, it was fine. He tried replacing parts anyway; still no luck. He'd ask the operator to show him what she did, and it still wouldn't fail. No sooner would he leave than down it would go. Finally, he obtained permission to just sit in the client's office until it failed. When it finally did, and he was able to reproduce the crash consistently, he had a new problem: how to tell the operator that the crash came from a discharge of static electricity from the pantyhose rubbing together on her rather corpulent legs.

Whether causality is deterministic at all levels of reality may be subject to debate, but the computers we work with in the IT industry were all designed to provide deterministic results, given known inputs and operating conditions. Outcomes that appear to contradict that model (such as schroedinbugs) invariably turn out to have deterministic explanations: something in the code, its inputs, or its environment changed in ways that we've missed. But for some reason, people would often rather believe in some inscrutable factor, which they hope to appease by not rocking the boat. We as consultants need to disabuse them of these notions, and make sure we aren't entertaining any of them ourselves.

About

Chip Camden has been programming since 1978, and he's still not done. An independent consultant since 1991, Chip specializes in software development tools, languages, and migration to new technology. Besides writing for TechRepublic's IT Consultant b...
