I sat down the other day to try to comb through a client's operational data. He had two server monitoring systems, four network monitoring systems, a very nice facility monitoring system, a project management server running the latest in commercially available software, a legacy project system which still didn't connect properly to it, and two separate support ticketing systems. He wanted to create or buy a tool which would gather all of this data into one place and deal with it so the staff could, somehow, act on it all.
Now, on a practical level we could just install any one of a number of additional products to bring all of this information into one place. I'm absolutely sure we could also develop some custom reporting tools which would bind the data together into something at least legible. I'm sure we could save time just by cutting down on the number of critical alerts and making it so people didn't have to go to six separate systems to figure out what is going on.
However, I suggested that before we started to develop or deploy new tools, his team needed to make a fundamental shift in how they approached problems. With such an unsorted mass of data available to them, the team had stopped checking anything other than their own ticketing system. It was a natural adaptation to the insanity. It had to go. Relying entirely on rumors and informal communications about what's going wrong works okay in a ten-man shop, but simply cannot sustain a three-hundred-person IT operation.
My suggestion that the team try to look into the other ticketing system (used by telecom, network, and operations) led to quiet jeering. I'm used to that – people love to tell me that I'm overly fond of tools to do relationship work. What I didn't expect, though, was a flat-out statement that it was impossible. How on earth could they check the other team's ticket system when they couldn't get at it through the firewall? According to the team I was working with, the network team had cut off access about a year ago and never restored it.
Curious, I spent a few minutes poking around on the network. Sure enough, the “operational” ticket system was blocked off from the “deployment and support” organization. I don't mean that the team didn't have security access; the core router actually blocked communication between the network segments. If you were in the deployment and support part of the organization, you could not reach the VLAN containing the operations tools.
Everyone assured me the loss of connectivity was intentional, a result of a falling out between two of the director-level management types. I would, I was informed, be walking into an angry hornet's nest if I brought it up. Although I believed my customer, I decided to take a chance anyway. After all, as a consultant, I can occasionally claim stupidity as a defense. I don't know everyone's history, so overstepping my bounds now and then doesn't always result in serious repercussions.
So, I trooped on out to talk with my contact down in networking. He looked perplexed when I asked about it. After checking out the situation, we went together to the gentleman in charge of the local corporate network. He was equally surprised; either the political storm causing the problem had long since blown over or he really didn't know anything about it. A few minutes of poking revealed an incorrectly configured start-up file; a minute later he fixed it both live and in the offending configuration.
Getting access to the system, naturally, did nothing to actually resolve the cultural or technological challenges shackling the organization's effort to understand its environment. It did, however, drive home to me one of the first rules of consulting taught to me long ago:
“Never attribute to malice what can be caused by miscommunication.”
If I manage to do nothing else for this client, I have opened up a new line of communication between the deployment/support and the operational groups. Hopefully, by next week I'll have a clearer idea of what I'm going to do about pulling together some of this data, even if it's just designating a senior to read through print-outs for an hour a day.