Mondays are good for melodrama. Everyone’s rested up from the weekend and filled with good intentions. Sometimes people even come up with good ideas after a few hours away from their keyboards. My own team, for example, had some excellent points to make about things we need to accomplish. So, either they came up with them over the weekend or I finally slowed down enough over the weekend to really listen to them today. Probably, given the nature of the world, a little bit of both.

It’s the “a little bit of both” that really struck home with me today. When people get to moving fast, and pushing hard, we tend to focus on one or two things we believe to be important. These key issues become the sole things we work on, regardless of everything else going on around us. We work at them until we resolve them, then feel honest surprise when the overarching problems do not go away.

In reality, a lot of problems (rather than incidents) in IT stem from multiple causes. Most real, recurring problems involve some process pieces, personal issues, misapplied or incorrectly configured technology, along with some good old-fashioned bad luck we didn’t notice at the time. It all kind of lurks around in the background until some relatively random event creates an incident, bringing all of the related causes to the surface. Sorting out what’s really going on, though, can take a lot of work and a level head.

Right now, for example, we’ve just finished work on an incident which disrupted service for almost five days last week. Everyone involved struggled with the system; we eventually fell back and punted rather than keep fighting with it. The workaround did stabilize the situation for now. With luck it will keep things mostly stable while we figure the rest of it out.

Everyone’s focus currently rests with the effort we went through. Yes, we did not communicate as well as we could have. Yes, our teams did not always share information in an actionable way. It also took us a while to figure out a good workaround.

My interest, and where I’ve asked my team to focus this week, falls squarely on the problems the incident exposed. The lack of technical planning, environmental consistency, and practical methods for troubleshooting just start the list. Far more fascinating was how the incident displayed preconceptions and fractures in the lines of authority/responsibility surrounding the technology in question. These are long-standing architectural and systemic issues no one can solve quickly. Seeing them in action, though, renewed my commitment to initiating slow change.

Meanwhile, in another forum, the same problems played themselves out during our preparations to restart a project plagued by technical glitches. It made for interesting listening, even though I mostly stayed out of it. I’m still questioning that decision, but if, after working with the project manager, my senior on the project, and a half dozen other folks, we couldn’t win the battles that mattered, I probably need to find another job. Fortunately, we got what we needed in terms of scheduling; everything else was a sacrifice we happily made. Tying my senior up for two more weeks wreaks havoc with my scheduling, but I want to make this work for a lot of reasons.

That doesn’t mean, though, that the team isn’t feeling the heat from having the senior out of action for this long. I’ve pushed an assessment job down onto one of my intermediates; he’s a good guy and very talented. But it’s his first serious assessment and it’s not something he’s ready for. That said, he’ll do a hell of a job with some help and coaching. I just have to somehow get the time to do that coaching while still making it to all of my meetings.

The same problems also came to a head over on another team’s plate this afternoon. Generally that’s something which causes me to offer my sympathy. In this case, though, it directly impacts all of my projects and my currently deployed systems. The other team’s work is good, but I have to question whether we’ve lost sight of the core technical problems they want to address. More importantly, if we have lost sight of those problems, what can I, as an outsider and not a well-liked one at that, do about it?

So, it’s the usual mix of negotiations and scheduling planned for the week plus a heavy dose of coaching for everyone involved. Including, I hope, me. I’m going to try to get in touch with one of my mentors to discuss some issues that came up last week.