Mondays are good
for melodrama. Everyone’s rested up from the weekend and filled with
good intentions. Sometimes people even come up with good ideas after
a few hours away from their keyboards. My own team, for example, had
some excellent points to make about things we need to accomplish.
So, either they came up with them over the weekend or I finally
slowed down enough over the weekend to really listen to them today.
Probably, given the nature of the world, a little bit of both.
It’s the “a
little bit of both” that really struck home with me today. When people
get to moving fast, and pushing hard, we tend to focus in on one or
two things we believe to be important. These key issues become the
sole things we work on, regardless of everything else going on around
us. We work at them until we resolve them, then feel honest surprise
when the overarching problems do not go away.
In reality a lot
of problems (rather than incidents) in IT stem from multiple causes.
Most real, recurring problems involve some process pieces, personal
issues, misapplied or incorrectly configured technology, along with
some good old-fashioned bad luck we didn’t notice at the time. It
all kind of lurks around in the background until some relatively
random event creates an incident bringing all of the related causes
to the surface. Sorting out what’s really going on, though, can take
a lot of work and a level head.
Right now, for
example, we’ve just finished work on an incident which disrupted
service for almost five days last week. Everyone involved struggled
with the system; we eventually fell back and punted rather than keep
fighting with it. The workaround did stabilize the situation for
now. With luck it will keep things mostly stable while we figure the
rest of it out.
currently rests with the effort we went through. Yes, we did not
communicate as well as we could have. Yes, our teams did not always
share information in an actionable way. It also took us a while to
figure out a good workaround.
My interest, and
where I’ve asked my team to focus this week, falls squarely on
the problems the incident exposed. The lack of technical planning,
environmental consistency, and practical methods for troubleshooting
just start the list. Far more fascinating was how the incident
displayed preconceptions and fractures in the lines of
authority/responsibility surrounding the technology in question.
These are long-standing architectural and systemic issues no one can
solve quickly. Seeing them in action, though, renewed my commitment
to initiating slow change.
In another forum, the same problems played themselves out during our
preparations to restart a project plagued by technical glitches. It
made for interesting listening, even though I mostly stayed out of
it. I’m still questioning that decision, but if, after working with
the project manager, my senior on the project, and a half dozen other
folks, we couldn’t win the battles that mattered, I probably need to
find another job. Fortunately we got what we needed in terms of
scheduling; everything else was a sacrifice we happily made. Tying
my senior up for two more weeks wreaks havoc with my scheduling, but
I want to make this work for a lot of reasons.
That doesn’t mean,
though, that the team isn’t feeling the heat from having the senior
out of action for this long. I’ve pushed an assessment job down onto
one of my intermediates; he’s a good guy and very talented. But it’s
his first serious assessment and it’s not something he’s ready for.
That said, he’ll do a hell of a job with some help and coaching. I
just have to somehow get the time to do that coaching while still
making it to all of my meetings.
The same problems
also came to a head over on another team’s plate this afternoon.
Generally that’s something which causes me to offer my sympathy. In
this case, though, it directly impacts all of my projects and my
currently deployed systems. The other team’s work is good but I have
to question whether we’ve lost sight of the core technical problems
they want to address. More importantly, if we have lost sight of
those problems, what can I, as an outsider and not a well-liked one at
that, do about it?
So, it’s the usual
mix of negotiations and scheduling planned for the week plus a heavy dose of coaching for everyone involved. Including, I
hope, me. I’m going to try to get in touch with one of my mentors to
discuss some issues that came up last week.