
Tracking down a tricky null reference bug

Justin James offers details and lessons learned about a frustrating null reference bug that took him a while to figure out without a debugger.

I recently found myself in one of those troubleshooting scenarios where you just can't use a proper debugger. Along the way, I discovered a bug that I would have solved in a couple of minutes with a debugger, but without one, it had me stumped for several days. It was an interesting little bug that was very frustrating to track down. I hope that by sharing this information you won't get stuck on it as long as I did if you encounter it.

Here's a sample piece of code:

var x = "t";
try
{
    var y = int.Parse(x);
}
catch (Exception ex)
{
    Console.WriteLine("Exception: " + ex.Message);
    Console.WriteLine("Exception Inner Message: " + ex.InnerException.Message);
}

You would expect the following to happen:

  1. Variable x is created as a string, with a value of "t".
  2. Enter a try block.
  3. int.Parse throws an exception, because "t" cannot be parsed as an integer.
  4. The exception's message property is printed to the screen.
  5. The inner exception's message property is printed to the screen.

The reality is step #5 never happens; instead, the application completely bombs out, with this exception message: "Object reference not set to an instance of an object." While the problem is easily caught in an IDE debugging session, in a scenario where you can't get a debugger attached (and we know those seem to happen all too often for a variety of reasons), we are pretty much out of luck. What's going on?

We are seeing a null reference occurring within the catch block itself, because there is no guarantee that ex.InnerException is not null (here, the FormatException thrown by int.Parse has no inner exception). And because of where the exception is happening, a lot of our usual methods of tracking the issue down are not available to us (Event Viewer, for example, seems to log a lot less information than you'd normally get). Even in a debugger, we are at a disadvantage; there is no Exception object on the stack for us to find -- as soon as we clear the Exception notice on the screen, there are no details of this exception anywhere. If you inspect the Exception object in the catch block, it shows the integer parsing issue, not this null reference item.

I learned two lessons here. First, you need to be very careful within catch blocks; exceptions within them will be hard to track down, especially if you cannot get a debugger attached. Second, you need to be aware that Exception.InnerException is an object reference that can be null and needs to be treated as such. I know this is pretty obvious, but if you are used to only using a couple of Exception's properties in code like I am (I inspect InnerException all of the time in the debugger, but I never used it in the code itself before), this can bite you pretty hard.
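One way to apply both lessons is to route exception reporting through a small helper that never dereferences InnerException without checking it first. What follows is a minimal sketch under stated assumptions, not the code from the project described above; the ExceptionReport class and its Describe method are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Text;

// Hypothetical helper (not from the original code): describes an exception
// and each of its inner exceptions, stopping when the chain runs out.
public static class ExceptionReport
{
    public static string Describe(Exception ex)
    {
        var sb = new StringBuilder();
        var depth = 0;
        // The loop condition is the null check: when InnerException is null
        // the walk simply ends, so nothing is ever dereferenced blindly.
        for (var current = ex; current != null; current = current.InnerException)
        {
            sb.Append(depth == 0 ? "Exception: " : "Inner exception " + depth + ": ");
            sb.Append(current.GetType().Name).Append(": ").AppendLine(current.Message);
            depth++;
        }
        return sb.ToString();
    }
}

public static class Program
{
    public static void Main()
    {
        var x = "t";
        try
        {
            var y = int.Parse(x);
            Console.WriteLine(y);
        }
        catch (Exception ex)
        {
            // Safe even though this FormatException has no inner exception.
            Console.Write(ExceptionReport.Describe(ex));
        }
    }
}
```

If only one level is needed, C# 6 and later also offer the null-conditional operator: ex.InnerException?.Message evaluates to null rather than throwing when there is no inner exception.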

J.Ja

About

Justin James is the Lead Architect for Conigent.

21 comments
freeman.sm

ex.printStackTrace() would have given you all of the information you needed to debug the issue. The problem here is that ex.getMessage() or ex.getLocalizedMessage() can be null and you need to allow for that in all cases. It all comes down to the fact that the code throwing the Exception does not have to set the message content. Remember that there are Exception constructors which do not include a message.

Tony Hopkinson

Ex.StackTrace will serve you better, as the inner exception chain is theoretically infinite. Also, if you use FxCop, it would have jibbed at that code and at least forced a null check...

Sterling chip Camden

They introduce unusual code paths and weird zones in your code. A monad is a much better model. Unfortunately, that could take quite a bit of work to implement in C#.

Justin James

I had already tried StackTrace, and I didn't get what I was looking for from it. :( J.Ja

Justin James

... and it wasn't giving me what I wanted. Good point on StyleCop. Interesting, ReSharper didn't pick it up as far as I can tell... J.Ja

jkameleon

Just let the bloody thing write its shit to the event log by itself. System exceptions (divide by zero, null reference, etc) should be handled by system. Handle that stuff by yourself only when you absolutely have to- which is about as common as a necessity of using goto statement. The "catch" block is a breeding ground for the nastiest of bugs.

Justin James

... it's also questionable that the API for Exception allows you to do things that can throw exceptions... it's a TOTALLY untestable piece of code. J.Ja

Tony Hopkinson

Something breaking the exception chain. Try ... Catch (Exception ex) ... throw ex; is the classic for that. Should be just throw, or throw new Exception("Fred",ex)
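As a rough illustration of the three options this comment mentions, the sketch below contrasts wrapping, a bare throw, and throw ex. The RethrowDemo class and its method names are hypothetical and chosen only for this example. Note that throw ex does not touch InnerException; what it loses is the original stack trace, which is reset to the point of the rethrow.

```csharp
using System;

// Hypothetical demo class; none of these names come from the discussion above.
public static class RethrowDemo
{
    static void Fail()
    {
        throw new InvalidOperationException("original failure");
    }

    // Wrapping: the original exception travels along as InnerException.
    public static void Wrap()
    {
        try { Fail(); }
        catch (Exception ex)
        {
            throw new Exception("Fred", ex);
        }
    }

    // Bare rethrow: the same exception continues up the stack
    // with its original stack trace preserved.
    public static void Rethrow()
    {
        try { Fail(); }
        catch (Exception)
        {
            throw;
        }
    }

    // "throw ex": the same exception object, but its stack trace is reset
    // to this line, hiding where the failure actually started.
    public static void RethrowResettingTrace()
    {
        try { Fail(); }
        catch (Exception ex)
        {
            throw ex;
        }
    }

    public static void Main()
    {
        try { Wrap(); }
        catch (Exception ex)
        {
            // The wrapper's InnerException is the original exception,
            // so a null check here succeeds.
            if (ex.InnerException != null)
            {
                Console.WriteLine(ex.Message + " <- " + ex.InnerException.Message);
            }
        }

        // Compare the two traces by eye to see what "throw ex" discards.
        try { Rethrow(); }
        catch (Exception ex) { Console.WriteLine(ex.StackTrace); }

        try { RethrowResettingTrace(); }
        catch (Exception ex) { Console.WriteLine(ex.StackTrace); }
    }
}
```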

Tony Hopkinson

Got away with it because of good habits. Looks like there's a relaxation of the rules inside catch blocks. Using FxCop all the time has simply engendered good habits, in a catch block or not. If (ex.InnerException != null) { ... } Bugger...

Justin James

... is that you don't always have access to it. You need to create an event source... which requires elevated permissions... to write to the log. In this particular case (I had so little access that I couldn't get a debugger on, remember?) I wasn't able to do that either. :( J.Ja

Tony Hopkinson

for last-ditch catching, if you have to (say, if you want it logged to file and/or the event viewer), is the Application's OnCurrentDomainUnhandledException event. But other than that, never ever ever trap exceptions you can't deal with; that's rule one.
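The hook this comment describes appears to correspond to the AppDomain.CurrentDomain.UnhandledException event in .NET. Below is a minimal sketch of a last-ditch logger built on it, assuming a console application; the CrashLogger class, its method names, and the crash.log file name are hypothetical and used only for illustration.

```csharp
using System;
using System.IO;

// Hypothetical last-ditch logger; names and log path are illustrative only.
public static class CrashLogger
{
    const string LogPath = "crash.log";

    // Registers the handler; call once at startup.
    public static void Install()
    {
        AppDomain.CurrentDomain.UnhandledException += OnUnhandledException;
    }

    // ExceptionObject is typed as object, so check for null before using it.
    // Exception.ToString() includes the message, the stack trace, and any
    // inner exceptions.
    public static string FormatEntry(object exceptionObject)
    {
        return exceptionObject != null
            ? exceptionObject.ToString()
            : "(no exception object)";
    }

    static void OnUnhandledException(object sender, UnhandledExceptionEventArgs e)
    {
        try
        {
            File.AppendAllText(
                LogPath,
                DateTime.UtcNow.ToString("o") + " " + FormatEntry(e.ExceptionObject) + Environment.NewLine);
        }
        catch (Exception)
        {
            // Best effort only: the last-ditch handler must never throw itself.
        }
    }

    public static void Main()
    {
        Install();
        // Application code goes here. Anything that escapes every catch block
        // reaches OnUnhandledException before the process terminates.
    }
}
```

Writing to a plain file rather than the event log is a deliberate choice in this sketch, since it avoids the need for a pre-registered event source.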

Justin James

Tony - I don't write the code that's being used and throwing the errors, but that's probably going on. It's OK to do that if the catch and re-throw adds real value and preserves the needed information (it's why InnerException was null as well), but it's not OK when the re-throw obfuscates the true, underlying problem. J.Ja

Tony Hopkinson

Your machine, 'plenty' of time, all those wonderful tools and you are still f'ed. Though that happening is often a clue to the sort of error you are looking for somewhere. :(

xmetal

We recently tried to figure out why something wasn't working on a partner's machine. Several things stopped us from being able to debug properly: 1. Inability to reproduce the issue elsewhere. This meant we had to work on this machine. 2. No debuggers or other dev tools installed. 3. Not allowed to install additional software. 4. Time constraints. This was for a stand-alone application (not a website or something like that) and a large portion of it is customizable and written in JScript.

jkameleon

Say no more. As far as I'm concerned, there's one word answer to this: Drupal.

Justin James

It's a very long story, let's just say that if I *could* have a debugger attaching to things, I *would*, but Microsoft has made it very difficult to debug the "plugins" for their CRM product... J.Ja

jkameleon

> I had so little access that I couldn't get a debugger on, remember? Establishing a local debugging environment and trying to replicate the bug there is generally less frustrating than debugging "blindly", the way we used to do it when I was young. I hated the meditation over a program printed on paper so much I wrote a symbolic debugger myself once. Why would you need to program without a debugger anyway? If you are troubleshooting something that can't be replicated in the development environment, you need access to certain things, and that's it. The people in charge of the production environment will just have to babysit you during the process; there is no other reasonable way to do it.

Tony Hopkinson

and one of the leaves is the Security event source. A moment's thought would have come up with the viability of, say, User EventSources and System EventSources. But as usual, user versus system privileges skipped right past MS's pointy heads.

Justin James

It totally boggles my mind that the Event Log would need elevated access to create an Event Source, especially since once it is created, there is nothing stopping you from using it... J.Ja

Tony Hopkinson

The way we addressed this was to set up a Source for us as an ISP, then sub-sources based on the various applications, and one called General. Code tries to find one for the currently running application; if not, it drops to General. Getting them in (or changing them) was a PIA though. Had to write an app that ran unelevated and reported the event sources. If they were different, then request elevated access (have admin, UAC, log on as Admin, blah blah), then fix them, and call that as a prerequisite of the install. Took a while to get that right, supporting everything from XP to Win 7, Win2003 to 2008 R2, and Terminal Services. Quite why MS forced this extreme coupling with the registry I've yet to figure out. Certainly isn't the way I'd have approached it; logging somewhere is, to me, much more important than logging to exactly the right place and conking out if it doesn't exist. UI-centric design again. Even if you do all the work to set up the event sources, there's nowt to stop someone with admin access deleting it from the registry and frying your carefully prepared algorithm. So you end up getting an error that you couldn't log something, and hope the thing you were trying to log is in it; otherwise you can't investigate the real fault.

Justin James

I do my best to stay out of global.asax, so that never occurred to me! J.Ja
