Project Management

Why consultants should banish magical thinking

Chip Camden makes the case that magical thinking, or at least quasi-magical thinking, doesn't benefit consultants or their clients in the long run.

In a post titled Bad Programmers that's still available on the now-defunct Infogami site, the first sign that you might be a bad programmer is "Inability to reason about code." The first two symptoms listed for that deficiency are:

  1. The presence of "voodoo code", or code that has no effect on the goal of the program but is diligently maintained anyway (such as initializing variables that are never used, calling functions that are irrelevant to the goal, producing output that is not used, etc.)
  2. Executing idempotent functions multiple times (eg: calling the save() function multiple times "just to be sure")

These two alternatives to logic further indicate a much more widespread disability for problem-solving: magical thinking, or at least quasi-magical thinking. Perhaps some individuals don't possess a rational belief in the efficacy of these actions, but they perform them anyway. They have an extreme reluctance to change anything, as if the computing gods would visit vengeance upon them for daring to explore the inner sanctum of the system. Yet, the unwillingness to refactor at all leads to some of the worst bugs in the long run.

My oldest son (who recently graduated from Stanford and is now working for a Silicon Valley startup -- can you tell I'm proud?) and I had a conversation the other day about this phenomenon. He observed that whenever he's encountered this attitude, it's accompanied by a sense of fear. "What are they afraid of?" he said, "Finding out the truth? If they don't figure out what's going on now, they'll just end up having to figure it out the night before release." That's my boy.

I think part of the fear comes from not wanting to get mired down in details. They know from experience that if they have to fully prove what the code is doing, it could run into hours of debugging time. So instead, they invent a plausible myth as the agent. "Oh, it must be the third-party component doing something weird," "it must be a timing thing," or my favorite, "it just doesn't like that customer." These explanations may be partially correct, but until you prove your assertion experimentally, it's no more than a myth.

With third-party components you may have limited ability to experiment, especially if they're closed-source, but you can still verify inputs and outputs. If the component shares your address space (like a library or a control), then the kinds of bugs they can create may indeed seem like some form of dark magic. In that case, you try to rule out the component by stubbing it.
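As a rough illustration of ruling out a component by stubbing it, here is a minimal Python sketch. Every name in it (SuspectFormatter, StubFormatter, build_report, component_is_implicated) is hypothetical and invented for this example; the "bug" in the suspect class is planted on purpose so the experiment has something to find.

```python
class SuspectFormatter:
    """Stands in for a closed-source component under suspicion (hypothetical)."""

    def format(self, value):
        # Planted bug for the demo: silently drops the sign of negative numbers.
        return str(abs(value))


class StubFormatter:
    """Trivial, fully understood replacement with the same interface."""

    def format(self, value):
        return str(value)


def build_report(values, formatter):
    """Our own code: joins the formatted values into one line."""
    return ", ".join(formatter.format(v) for v in values)


def component_is_implicated(values, expected):
    """True only if the failure appears with the suspect and vanishes with the stub."""
    fails_with_real = build_report(values, SuspectFormatter()) != expected
    fails_with_stub = build_report(values, StubFormatter()) != expected
    return fails_with_real and not fails_with_stub
```

If the failure survives the swap, the component is probably not the culprit and the search moves back to your own code; if it disappears, the myth has at least become a tested hypothesis.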

Timing issues can be some of the most intractable of bugs. Nevertheless, 90% of the time someone suggests this is the source of the problem, they're wrong. It's an easy scapegoat, because it's so hard to prove right or wrong. So instead of just making the claim, try to construct a test that reproduces the same problem at different speeds.
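One way to put a "timing thing" claim to the test is to run the same scenario at several artificial speeds and see whether the failure tracks the delay. The sketch below is only an assumption about how such a harness might look: process_batch, its delay parameter, and the planted off-by-one bug are all hypothetical.

```python
import time


def process_batch(items, delay=0.0):
    """Hypothetical code under suspicion; `delay` slows down each step."""
    total = 0
    for item in items[:-1]:  # planted bug: skips the last item at any speed
        time.sleep(delay)
        total += item
    return total


def failures_by_speed(items, expected, delays):
    """Run the same scenario at each delay; map delay -> whether it failed."""
    return {d: process_batch(items, d) != expected for d in delays}


results = failures_by_speed([1, 2, 3], expected=6, delays=[0.0, 0.001, 0.01])

# Failing at every speed is evidence against the timing hypothesis.
timing_ruled_out = all(results.values())
```

Here the failure shows up at every delay, so timing is an unlikely explanation. Had the outcome changed as the delay changed, the timing claim would have gained real evidence instead of remaining a guess.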

Attributing personality flaws to software or hardware is unlikely to provide an accurate model of what is happening. Nevertheless, it does sometimes seem that certain people evoke a particular failure. You get something fixed, and as you send it to them you know in the depths of your heart that it will not solve the problem, but only for them. However, a rational explanation always surfaces in the end. A former DEC field engineer once told me about a case like this. One of the old PDP-8 systems at a customer site crashed on a daily basis. Every time the engineer went to check it out, it was fine. He tried replacing parts anyway, still no luck. He'd ask the operator to show him what she did, and it still wouldn't fail. No sooner would he leave, than down it would go. Finally, he obtained permission to just sit in the client's office until it failed. When it finally did, and he was able to consistently evoke a reproduction, he had a new problem: how to tell the operator that the crash came from a discharge of static electricity from the pantyhose rubbing together on her rather corpulent legs.

Whether causality is deterministic at all levels of reality may be subject to debate, but the computers that we work with in the IT industry were all designed to provide deterministic results, given known inputs and operating conditions. Outcomes that appear to contradict that model (such as schroedinbugs) invariably turn out to have deterministic explanations. Something in the code, its inputs, or its environment changed in ways that we've missed. But for some reason, people would often rather believe in some inscrutable factor, which they hope to appease by not rocking the boat. We as consultants need to disabuse them of these notions, and make sure we aren't entertaining any of them ourselves.

About

Chip Camden has been programming since 1978, and he's still not done. An independent consultant since 1991, Chip specializes in software development tools, languages, and migration to new technology. Besides writing for TechRepublic's IT Consultant b...

73 comments
prh47

Fear doesn't have to be irrational. There can be the fear that if you change something, something else will break. In highly modularized code this may be less of a concern, but in highly monolithic code, it definitely needs to be considered. Given enough time you may be able to figure everything out, and be 99.99999% certain that the cows will still give milk, but if you don't have the time, and you're only slightly omniscient (which is the best most programmers can hope for), there is good reason for uncertainty.

registered

To Chip Camden: As you say, "I think part of the fear comes from not wanting to get mired down in details. They know from experience that if they have to fully prove what the code is doing, it could run into hours of debugging time." That's certainly been my experience. Spot on, well said.

mattohare

once I get rid of all that rubbish. I shrunk an executable from 12 megs to just over 3 megs by getting rid of error handling code (that scared the user more than handling the error), add-ins that were made obsolete by newer platform features, and some other code cleanup. The added bonus, I had less code to go through to add new features.

mdresel

I almost got thrown out of graduate school over this one: Senior psychology professor, very proud of his PDP 8, with a HARDWARE random number generator board. He told us of debugging a program that used the last bit of the random number to make a binary switch - '0' - do A, '1' - do B. When debugging his program, he tried 8 times in a row to get a 1, but kept getting 0. I told him (this is where I almost got thrown out) that if he had gotten experimental results that unlikely, he would have already published the paper. Moral: just because it is unlikely, doesn't mean that it didn't happen.

LocoLobo

I agree in essence with your article. Not a programmer but as a user if I have to click "Save" 3 times instead of once to make it work I will. I don't like it. I tell myself I will look into the process later but right now the project is due. There will always be another project right after that. The only solution I have is to continue chipping away at a problem trying to learn the system by studying its behaviour. What I'm saying is sometimes you need to figure in the cost of analyzing the problem to the end. If you have a klugy solution that works, do you move on or not? There are good arguments for both.

AnsuGisalas

Could it be something so simple as insecurity? Insecurity (not lack of information security - I am talking about the human kind, here) is a major factor in a lot of things. Anyone who's gotten the hang of the dating process will know that insecurity is the mother of all scoreblockers, it's worse than bad breath or acne or any other superficial flaw, because insecurity can be detected from across the room, even in a deafeningly noisy, chaotic venue. But that's just how insecurity affects others, the way it affects oneself is far worse. It makes people make mistakes, it makes them not want to examine their own reasoning, it stops them from learning things, even if they try. So, I have had a smattering of programming experience, a tiny smattering of three or four languages, from Basic to Fortran. I remember trying to do a reasonably simple sentence structure parser, to fit a sentence (preferably any sentence) to a set of ten parameters. When I was done with the basics and was trying to get it to compile, let alone do what it should, was I gonna review it in detail? Hell no. I just picked at the fringes of it, trying to make surface fixes to make it work. Because I was insecure about my basis for writing it: had I checked all the variables, had I planned it out properly, etc. Now, there is a difference of many orders of magnitude to what Chip is describing, and the people he is talking about are far more knowledgeable and experienced than I was (even before I forgot everything I learned), but it sounds like insecurity all the same. And insecurity really doesn't care about amounts of knowledge or degrees of experience... it gnaws at the soul. So maybe these people "just" need to know themselves.

seanferd

Magical thinking doesn't help. Again, I want my thumbs-up for articles back.

wizard57m-cnet

if we have to give up our magic powder, wands, potions, spells and crystal balls? Ever try to explain rational code to some users? They would get the idea that we were really just educated individuals rather than the magical beings from an alternate universe with empathic feelings for computer users of Earth!

Tony Hopkinson

that some of the bugs I've been asked to fix, can only be explained by small blue aliens from Rigel IV with a penchant for practical jokes. :D I've seen some weird ones in my time, such as the photosensitive dumb terminal, but in the main, the more arcane the explanation you have to cook up to explain why this is happening, the more likely you've missed something horribly obvious.... The photosensitive dumb terminal always went wrong at sundown. Swapped damn near everything, still did it, only once and only at sundown. Cable to it was a kilometer, so we didn't want to swap that, tested it to death though, finally got tricians to agree to running a new cable, so they unwrapped the current one from the power supply to the 440v neons that lit up the warehouse when it got f'ing dark... Spiralled round a huge dirty power supply cable for two hundred metres, ffs

Sterling chip Camden

The more intelligent software becomes, the more bugs look like personality disorders. But we're still a long way from having software as complex as the human mind, and in any case we should be able to reason about software that humans created, no matter how complex it is.

Tony Hopkinson

magical thinking doesn't resolve uncertainty, it disguises it. Not a very good disguise either, big fat bloke with a beard wearing a dress claiming to be Keira Knightley....

Sterling chip Camden

I tend to think the real answer is closer to the second hypothesis at the end.

AnsuGisalas

1/512 is pretty everyday stuff. Medicine has to be marked for side effects with smaller probabilities. If it had happened twenty times there'd be something to wonder about.

hippiekarl

Witness those lining up to slap down their house payment on 'black' when a Roulette wheel's paid 'red' 8 or 9 spins in a row. You can't convince these folks that a 'random number (and color) generator' isn't now suddenly--and certainly--poised to 'seek equilibrium' with the 'universe'! Such a wheel, of course, has the exact same number of opportunities for red (or black) as it did on the preceding spins: 18 out of 38 (5.26% less than '50-50'), and so there's a new 'exact same likelihood of any result' every single spin: 18:38 red, 18:38 black, 1:19 green! Oh, yeah; 'green'(!)...most magical-thinkers are already in denial that their "even-money bet" on red/black is NOT a 50-50 proposition (1/19th of the time, one of the two green numbers ['0' and '00'] will play, winning the house all red and black bets, 5.26% of the time, all day and night!). It's a 'loser mentality' that ascribes memory and 'compensation by inanimates' to such math variables as 'standard deviation', 'probability' (which they rationalize as 'eventual certainty'), and 'short-term fluctuation'. "All bets down, people! Round she goes!"

Sterling chip Camden

But we still shouldn't invent reasons for them, unless we properly qualify those as untested hypotheses.

gechurch

I agree that the real world can be a bit of a pain, with deadlines and the like. And there will always be a new feature that needs building, whereas fixing code problems like this will never be a priority. To my mind, this is where personal pride comes in. You need to resolve to find out what the problem with the code is and fix it. Maybe not right now, but soon. The problem with leaving the sort of code you mention is that you're probably only seeing the tip of the iceberg. When I've seen bugs like this it normally turned out that the root cause was very different from what I thought. Sometimes the buggy code ran a long time before the bug manifested itself and normally I find that the comments explaining the code where the bug appears are incorrect. This is a dangerous situation and it makes it very difficult to implement new features. It's also bugs like this that tend to break when you move to the next operating system - often the hack to "fix" the code only works by coincidence. Ignoring it won't make the problem go away... much better to deal with these issues piecemeal.

andrew232006

I mostly agree with the article. But I think it is also worth keeping in mind that programmers have to deal with thousands of lines of code that they didn't write and sometimes can't even read. It may not be magic, but sometimes as far as me fixing it, it might as well be.

Tony Hopkinson

So you click it three times. As a user, what choice do you have anyway? As a programmer a bug with a workaround like that would scare the crap out of me. The number of things that have and are and will go wrong is a real sphincter loosener. It's precisely the sort of bug that Mr Camden is aiming at. Got to be fixed, and now, cost is irrelevant, unless you are going to bin the application in the next week or so.

Sterling chip Camden

I think the impostor syndrome enters into it, which is of course a manifestation of insecurity. "I don't want to get more deeply involved with this, because then I'll prove myself to be as incompetent as I suspect I am." Unfortunately, many of them are right. But I do find that some very sharp people who only lack a breadth of experience tend to doubt themselves too much. That probably goes for you and your parser, too.

HAL 9000

But I do have a couple of Sonic Screwdrivers which I wave over things and they just work afterwards. Works every time and I like the looks on some people's faces when I do it. :D Now let's see if this posts as I haven't been able to post to this thread for a couple of days now. :( Col

Sterling chip Camden

... it will still sound like magic to most people. You can keep your wand.

completeitpro

The hard part of finding a reason for the bug is explaining it to the users. Trying to explain inefficient SQL queries using non-indexed fields causing timeout errors might be better explained as a "database setting causing inefficient areas to not complete", or something similar.

AnsuGisalas

There's a slight, and I do mean slight, chance that they'll also stop blaming you for making their computer stop working from across town. Ok, who am I kidding, they'll never stop that! :D

Charles Bundy

Had a VT100 that would do something similar, except instead of crashing it would start spewing random characters when it got dark. That's what you get when the electricians wind Rx/Tx around fluorescent light ballasts and use the drop ceiling grid as a ground plane...

dogknees

There was a completely rational reason for the behaviour. It might not have been obvious, but that doesn't mean it's not logical. I've had a few over the years myself. My favourite was a bubblejet printer that would not print on cheque stationery. It worked fine when we did a test run on plain paper, but as soon as we loaded the cheques, it would fail. It turned out to be the paper sensor in the printer was getting a reflection of the magnetic ink on the cheques and mistaking it for paper running out. Moved the cheque stationery to one side by a centimeter and the problem went away.

AnsuGisalas

I mean, a lay person might do that, but electricians should have a different kind of relationship with the concept of copper coils than the lay public... scary.

HAL 9000

Some of the things I've seen done make me wonder what the person installing the thing was thinking at the time. Or again today a Computer that has not been touched or altered in any way according to its user would shut down. Just turn it on, walk away and do something, and when you return it's off. Turns out that the CEO who has not altered "anything" had the thing set to turn off after 5 minutes of inactivity, so it must have been one of the last Patches from M$ that did it. :^0 Col

HAL 9000

After all the Hardware & Software have ganged up and are trying to get me. I know this from old and take steps to prevent it happening now. So when the wind is blowing that way I stand on the right foot with the toes pointed that way and clamp my tongue between my teeth with it hanging out the right side of my mouth. When the wind blows at a different level or direction I act differently. But seriously I can remember some really strange things happening over the years, ranging from an Operator typing too fast for the hardware which caused a State Wide Crash, some really Poor Coding practices which only seemed to affect one operator when she sat a certain way, to yesterday when I pulled a new NB out of its box and found that the HDD was blank. I couldn't fault the drive, it tests perfectly, and it could be A OK and just someone at the factory has screwed up, or it could be that the Platters have a faulty coating on them which is causing the Stored Binary to just disappear over time when it's unused. Either way They is Ganging up on me. Just not sure what to do from there with the last one but I'll deliver it today and see what if anything happens. Oh and in the first case mentioned here that operator knew exactly what she was doing and if she wanted a break she would break the system, but I can Guarantee that She didn't tell the Tech who was sent out to fix the mess. ;) However Chip I do tend to disagree with this bit of your post above: "and in any case we should be able to reason about software that humans created". That works on the supposition that the person/people creating the software are sane. If they are Insane/Deranged there is no way to attempt to understand the mental processes that went into creating the code or what processes they have instigated to make it do what they wanted.
You may find that the "Deranged" programmers (who incidentally were always that way and were never allowed near the public lest they scare them) have developed something so strange that "Normal" Programmers can not understand their Thought Processes. However that just means that there will be a new position opened, "Shrinks" for AIs, where they talk the computer into repairing its own code. :^0 Col

Charles Bundy

without looking at the switch it is as much "magical explanation" as its actions. A lot of switches don't have lugs connected to the body. I still think there is a lot we don't know about the universe and when we hit the unknown "magic" per Arthur C. Clarke takes over. Case in point was an incident at the local utility board. Engineers started putting in capacitor banks at strategic locations. One happened to be about a half mile from a dairy farm. Cows stopped producing milk, and it all started with the capacitor bank. Farmer measured micro voltage present on the assembly that hooked to the poor cow's teats! The engineers said that it wasn't caused by them, but lo and behold they could produce micro voltage simply by bringing the bank into the circuit. PhD's from Virginia Tech couldn't make sense of it nor TVA. Finally they gave up and relocated the unit and cows were happy. :) Per Occam's Razor sometimes magic is the simplest answer!

Tony Hopkinson

generate the same set of numbers unless, you intervene.

Sterling chip Camden

The user is subject to the misbehavior of software over which he/she has no control, whereas for the programmer that's his job. Programmers sometimes face the same thing when they use closed-source tools that misbehave. I still like to see people experiment to narrow down exactly which rituals are required or not. It often leads to a clearer understanding of the problem.

Charles Bundy

I'm curled protectively over it right now... :)

hippiekarl

...and showed him a doll with a pin sticking out of it.

Chilidog67

You're giving electricians too much credit. I've worked with many who may have mastered electrical wiring but electronics are a foreign concept. One time I had an electrician wire 4 offices with CAT 5 for me. He daisy chained them from one punch down to the next like they were electrical outlets. I won't talk about what they did to my 1800ft 12 strand 50m fiber...it still makes me sad.

dogknees

While it may be impossible to reason through the insane programmer's thought process, it is certainly possible, though perhaps difficult, to reason through the code itself. No matter how convoluted it might appear it will still execute as written and in a deterministic fashion. The idea that code doesn't execute as written is the "magical" belief we need to stamp out.

AnsuGisalas

resonance? Works for chloroplasts. Probably works for lightning, too... and physics isn't so good at describing it, yet.

apotheon

I probably would never have noticed the downvote if you hadn't commented. . . . and on Veteran's Day, too! The nerve!

Tony Hopkinson

marked him down. Come on fess up... The comment to explain why you did, seems to have disappeared .... :p

apotheon

This hypothetical software bug and how people talk about dealing with it seems shockingly reminiscent of recent problems with vanishing comments in TR discussions.

AnsuGisalas

is stick that doll with a soldering iron... ...only problem is, you have to find the right place to stick it, and if you notice the doll is NTSC display only - that is, Never Twice Same Charring.

Tony Hopkinson

from the first day. IT systems are composed of People, Processes and Kit. If you are missing one of them, you are missing a lot.....

dogknees

I guess the point is that code doesn't act in a vacuum. You have to analyse the whole system, possibly including the hardware, to find the problem. One of my favourite bits of code was written to work around a hardware fault. I blame the physicists for their "simplifying assumptions".

Tony Hopkinson

was trying to get an RS232 comms circuit to work. Loopback on the line driver told me there was a short somewhere. Took the socket to pieces. 4 wires, two terminals... Best of it was they'd said I couldn't do the wiring because I wasn't a qualified 'trician.....

Tony Hopkinson

and no one's failure, including their own, then? You might want to take a few remedial leadership courses...

HAL 9000

In that case you took an already working program, loaded it to new hardware, and it turned very expensive race motors to scrap metal. It's a case where a previously working app didn't work when it was loaded to new hardware which it was supposed to be compatible with. There is a very big difference between writing a program for specific hardware, which doesn't happen any more anyway, and writing an app for a Platform where it works great for a few years and then starts giving Major Problems. The App in question was used to make the programming for EPROMS used in Engine Management Systems in the racing side of the industry. Compile the code through this app and then burn a PROM type thing. All old hat now but then it was right at the Bleeding Edge and a nasty thing to fix. ;) Col

Sterling chip Camden

It's the fault of some self-styled Agile programmers. Every methodology has its perverters, and there's a big difference between "agile" and "falling apart at the seams".

dogknees

The hardware acted in a fixed way. Once you wrote for the hardware rather than the documentation, it worked.

pkesel

So many Agile programmers consider working code the only necessary component to successful development. Without knowing the decisions made to bring code to its current state and the reasons those decisions were made, the next step a programmer makes based on reading code alone may undo months worth of decision making. This is where the architect steps in. The architect is responsible for the success of every developer on the team and the continuity of the solution.

AnsuGisalas

Now days I get to fix bugs in code so I'm constantly looking over others code when it starts to give problems and most of those times it's almost impossible to the original problem reliably.

hippiekarl

I just found another possible bug regarding links to blogs without comments attached, and 'discussions' without their articles attached.

HAL 9000

Why can you post the entire thing in 3 parts but not 1? It's got to be a Combination of words but I'm damned if I understand. :D Col

hippiekarl

...the original problem reliably." So, looking for a 'flaggable' word somewhere, it seemed feasible that "...almost impossible to 'un-something' the original problem reliably." both served the 3-part comment's meaning and could've used a euphemism for 'unmucked' (ahem). But you wouldn't have, with ladies about, I'm sure.

HAL 9000

So if you read it that way it sort of makes sense. But if you put breaks in it it doesn't make any sense. :D Col

hippiekarl

the possible culprit? It appears that it would've been 'un-flock' (actually a certain vowel replacing 'lo'). It makes the comment make sense, and would (I suppose) get it flagged, as well.

HAL 9000

Anyone got any ideas what in G to G2 is unacceptable to TR, as that is what has been causing the post to disappear? I must have completely lost it as I can not see anything wrong there. Col

HAL 9000

Of course I may be completely Insane as I've been doing this type of work for several decades now so I may have lost it all ages ago. :^0 Col

HAL 9000

the original problem reliably. ;)

HAL 9000

when it starts to give problems and most of those times it's almost impossible to

HAL 9000

Now days I get to fix bugs in code so I'm constantly looking over others code

HAL 9000

Not so much a problem with 1 person making modifications over a year or two, but when several different people make minor modifications over a decade or so things get into the Magical category that it works at all, let alone works as well as it does.

HAL 9000

In the end I gave up trying to understand once I came to the conclusion that the majority of people coding did not look at the entire system, just small subsections, and they would "Develop" improvements into their Specialist Sections which by themselves looked good but when looked at as part of an entire system left people wondering why it was done that way.

HAL 9000

But Honestly before I stopped punching code my biggest problem was attempting to understand what was going on in something that had been developed and then "Improved" numerous times over the years by different programmers/developers, and why what was happening was even designed to work that way.

HAL 9000

And what the programmer was hoping to achieve is also part of understanding the code. If you can not understand why it was written to perform that way it makes it much harder to impossible to understand what was envisaged by its original developer.

HAL 9000

It doesn't always apply. I remember the first Pentiums years ago with a Floating Point Issue which was interesting to say the least, and the Code didn't behave as written on that though it worked perfectly on non-Pentium CPUs.

HAL 9000

Not quite right there though that is what should happen