Compiler Optimizations Can Only Go So Far


My most recent post ("How I Improved Execution Speed By 100 Times In 5 Minutes") generated some tremendous feedback. One of the things that was touched upon a number of times was the possibility of the compiler or managed code environment caching values used in loops, or automagically transferring the value to an invariant, as I did in my sample code. One poster, Kevmeister, made some great points, which I would like to discuss in more detail.

Here is what Kevmeister says regarding the compiler performing these kinds of optimizations:

The chief problem is that "properties" might be public member variables or they might be accessor functions. How in fact does a compiler actually determine that the expression it is dealing with is in fact constant? If it is an accessor function how does it know what the behaviour of the accessor function is in computing the value of that property? Even if it is a public member variable, how does it know that some other part of the loop will not cause a side-effect resulting in that public member variable being modified? I think an optimiser needs to be thoroughly clever to determine these kinds of things, if it is possible at all.

Well said indeed. The compiler really has little way of knowing for sure that the value has not changed. Even if the compiler tore apart the source code to the object used to generate the property, it has little way of deciding what could be cached and what could not be cached, unless it was able to track the value back to a hardcoded number or something declared as a constant.
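To make Kevmeister's point concrete, here is a minimal sketch. The Inventory class and the InvariantDemo module are made up for illustration. At the call site, inv.Count looks like a plain value, yet the loop body changes what it returns on every pass, so a compiler that cached the first reading would run the loop a different number of times.

```vbnet
Imports System.Collections.Generic

' Hypothetical class, for illustration only.
Public Class Inventory
    Private ReadOnly _items As New List(Of String)

    Public Sub New(initialCount As Integer)
        For n As Integer = 1 To initialCount
            _items.Add("item" & n.ToString())
        Next
    End Sub

    ' Looks like a simple value to the caller, but is computed on each read.
    Public ReadOnly Property Count As Integer
        Get
            Return _items.Count
        End Get
    End Property

    ' Removes the last item, if there is one.
    Public Sub RemoveLast()
        If _items.Count > 0 Then _items.RemoveAt(_items.Count - 1)
    End Sub
End Class

Public Module InvariantDemo
    ' Returns how many passes the loop makes when Count is re-read every time.
    Public Function PassesWhileShrinking(startCount As Integer) As Integer
        Dim inv As New Inventory(startCount)
        Dim i As Integer = 0
        While i < inv.Count      ' re-evaluated on every pass
            inv.RemoveLast()     ' side effect: Count shrinks
            i += 1
        End While
        Return i
    End Function
End Module
```

Starting with 8 items, PassesWhileShrinking(8) returns 4, because the counter and the shrinking Count meet in the middle. Had the first reading of Count been cached, the loop would have made 8 passes instead.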

Another issue that Kevmeister did not mention is multithreading. What happens if the looping code is in one thread, and another thread does something that would change that value? It is bad enough in a For loop, where the idea of the check value suddenly jumping around could be quite dangerous, but that may indeed be the desired behavior in a While or Until loop. Some people will put a loop in one thread, checking a value that gets changed by a separate thread. So in a multithreaded situation, caching the value is not very helpful.
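The two-thread pattern described above might look something like this rough sketch. StopFlagDemo and its members are hypothetical names. The loop is only correct if the flag is genuinely re-read on every pass, which is why the sketch uses Volatile.Read and Volatile.Write from System.Threading rather than a plain field access.

```vbnet
Imports System.Threading

' Hypothetical module, for illustration only.
Public Module StopFlagDemo
    Private _stopRequested As Boolean = False

    ' Runs on the second thread: waits briefly, then asks the loop to stop.
    Private Sub RequestStopAfterDelay()
        Thread.Sleep(50)
        Volatile.Write(_stopRequested, True)
    End Sub

    ' Spins until the other thread sets the flag; returns the number of passes.
    Public Function RunUntilStopped() As Long
        Volatile.Write(_stopRequested, False)
        Dim passes As Long = 0

        Dim stopper As New Thread(AddressOf RequestStopAfterDelay)
        stopper.Start()

        ' If this read were cached as False, the loop would never exit.
        While Not Volatile.Read(_stopRequested)
            passes += 1
        End While

        stopper.Join()
        Return passes
    End Function
End Module
```

The number of passes varies from run to run. The point is only that the loop ends at all, and it does so because nothing cached the value it is checking.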

Another unmentioned idea for caching these variables would be for the compiler to attach a dirty bit to some values. If the value itself (or any underlying data) were ever to change, the dirty bit would be set. For a single value, this would barely impact performance. Any variable could subscribe to the dirty bit of another variable, allowing a bubble-up effect. This idea has some merit, but although the overhead on an individual level is rather small in most cases, on a mass scale it could be devastating to performance. And it still does not address the issue of what happens when the value to be checked is derived rather than stated. Consider the following loop statement (I just made up a timer class here, please ignore any similarity to an actual object):

while dtTimeToStop >= timerProcessTimer.CurrentRunTime()

If the compiler attempts to cache CurrentRunTime(), the loop compares against a stale time and may run long past its deadline, or never stop at all. Even if the compiler updated its cached value periodically, the cure would be worse than the disease.
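For the curious, the dirty-bit idea could be sketched by hand roughly as follows. TrackedValue and CachedDouble are invented names, and this is just one way to wire it up using an ordinary .NET event. Note what makes it work: there is a setter that can raise the notification. A running timer has no such write to hook into, which is why the scheme does nothing for a derived value like CurrentRunTime().

```vbnet
Imports System

' Hypothetical classes, for illustration only.
Public Class TrackedValue
    Private _value As Integer

    ' Raised whenever the stored value actually changes.
    Public Event Changed As EventHandler

    Public Property Value As Integer
        Get
            Return _value
        End Get
        Set(newValue As Integer)
            If newValue <> _value Then
                _value = newValue
                RaiseEvent Changed(Me, EventArgs.Empty)
            End If
        End Set
    End Property
End Class

' Caches source.Value * 2 and recomputes only after the source has changed.
Public Class CachedDouble
    Private ReadOnly _source As TrackedValue
    Private _cached As Integer
    Private _dirty As Boolean = True

    ' Counts how often the cached result had to be rebuilt.
    Public Recomputations As Integer = 0

    Public Sub New(source As TrackedValue)
        _source = source
        ' Subscribe to the source's "dirty bit".
        AddHandler _source.Changed, AddressOf MarkDirty
    End Sub

    Private Sub MarkDirty(sender As Object, e As EventArgs)
        _dirty = True
    End Sub

    Public ReadOnly Property Value As Integer
        Get
            If _dirty Then
                _cached = _source.Value * 2
                _dirty = False
                Recomputations += 1
            End If
            Return _cached
        End Get
    End Property
End Class
```

Every write to a TrackedValue now carries a comparison and an event dispatch, and every subscriber carries a flag and a handler. That is small for one value, but it is the overhead that adds up on a mass scale.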

Kevmeister also pointed out that something within the loop may change that value. What he does not mention, and what blows the whole idea out of the water, is this: what if accessing the value itself changes it? Here is a fake class:

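Public Class Class1

    Private intWhenToStop As Integer

    ' Reading this property changes it: every access counts down by one.
    ReadOnly Property WhenToStop() As Integer
        Get
            If intWhenToStop > 0 Then
                intWhenToStop -= 1
            End If

            Return intWhenToStop
        End Get
    End Property

    Sub KeepGoingUntilDone()

        Dim intCounter As Integer

        ' Visual Basic reads the To bound once, when the loop starts,
        ' so any extra read made on the programmer's behalf changes the count.
        For intCounter = 0 To WhenToStop
            'Do Stuff
        Next
    End Sub

    Public Sub New()
        intWhenToStop = 500
    End Sub

End Class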

See the problems with caching or otherwise touching WhenToStop() by anything other than the programmer's code? Yes, it is a rather bizarre example. But I have seen situations where this kind of code does indeed make the most sense (especially before the first few pots of coffee).
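To put numbers on it, here is a small sketch that uses the same kind of self-modifying property. Countdown and HoistingDemo are hypothetical names. It compares a While loop that reads the property on every pass with one where the read has been hoisted into a local variable, which is what an over-eager optimizer would effectively be doing.

```vbnet
' Hypothetical classes, for illustration only.
Public Class Countdown
    Private _remaining As Integer

    Public Sub New(start As Integer)
        _remaining = start
    End Sub

    ' Reading this property decrements it, like WhenToStop above.
    Public ReadOnly Property Remaining As Integer
        Get
            If _remaining > 0 Then _remaining -= 1
            Return _remaining
        End Get
    End Property
End Class

Public Module HoistingDemo
    ' The property is read on every pass, as the programmer wrote it.
    Public Function CountReadingEachPass(start As Integer) As Integer
        Dim c As New Countdown(start)
        Dim passes As Integer = 0
        Dim i As Integer = 0
        While i < c.Remaining
            passes += 1
            i += 1
        End While
        Return passes
    End Function

    ' The property is read once and the result reused, as a cache would do.
    Public Function CountWithHoistedRead(start As Integer) As Integer
        Dim c As New Countdown(start)
        Dim limit As Integer = c.Remaining
        Dim passes As Integer = 0
        Dim i As Integer = 0
        While i < limit
            passes += 1
            i += 1
        End While
        Return passes
    End Function
End Module
```

With a starting value of 10, CountReadingEachPass returns 5 and CountWithHoistedRead returns 9. Same class, same loop body, different program.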

So the idea of the compiler or managed code environment automatically handling these situations is really not such a great one.

But that still does not let the compiler off the hook completely. It is my belief that just as the compiler throws out warnings and errors, it should also throw out suggestions and hints as well. Visual Studio is very intelligent about many things. I could see the IntelliSense system being beefed up to give some performance tips. Not many programmers get taught any particular language, and far too few programmers read about programming outside of reference books for syntax lookup. With that in mind, I think that while the compiler automatically handling these types of issues is not a good idea, the idea of the compiler making some helpful suggestions is not a bad one at all.

J.Ja

About

Justin James is the Lead Architect for Conigent.
