
Multithreading tutorial, part three: Single Threaded Performance


This is the third installment of a multi-part series demonstrating multithreading techniques and performance characteristics in VB.Net. Catch up on the previous installments: Introduction to multithreading and The Application Skeleton.

In my previous post, I created the skeleton of a project that demonstrates multithreaded performance. In this post, we will be filling in the skeleton to dispatch the work to the correct function, and creating a performance baseline using a single thread.

During testing for this post, I found that the Compute() function outlined in the previous post did not work as expected, so I have revised it slightly. The concept is the same, but the computation has been tweaked to eliminate overflow errors in the math. The revised Compute() function is:

    Private Function Compute(ByVal InputValue As Double) As Double
      Dim DoubleOutputValue As Double
      Dim DateTimeNow As DateTime
      Dim rndNumberGenerator As System.Random

      ' Seed the random number generator from the current time.
      DateTimeNow = New DateTime(DateTime.Now.Ticks)
      rndNumberGenerator = New System.Random(DateTimeNow.Hour + DateTimeNow.Minute + DateTimeNow.Millisecond)

      ' Pick one of two arbitrary floating-point computations based on the current millisecond.
      If DateTimeNow.Millisecond > 500 Then
        DoubleOutputValue = System.Math.IEEERemainder(System.Math.Exp(rndNumberGenerator.Next * (InputValue + 5000) * System.Math.E), rndNumberGenerator.Next)
      Else
        ' CInt makes the Double-to-Integer conversion explicit (required under Option Strict On).
        DoubleOutputValue = rndNumberGenerator.Next(CInt(InputValue)) / System.Math.Max(Double.MaxValue - 1, System.Math.Log(System.Math.Pow(System.Math.PI, InputValue)))
      End If

      DateTimeNow = Nothing
      rndNumberGenerator = Nothing

      Return DoubleOutputValue
    End Function

Our single thread run looks like:

    Public Sub SingleThreadComputation(ByVal Iterations As Integer)
      Dim IntegerIterationCounter As Integer

      ' Call Compute() once per iteration on the calling thread; the result is discarded.
      For IntegerIterationCounter = 1 To Iterations
        Compute(CDbl(IntegerIterationCounter))
      Next
    End Sub

Finally, here are our performance characteristics for 1,000,000 iterations, in milliseconds per test:

                     Test 1     Test 2     Test 3     Test 4     Test 5    Average
    System A  12031.250  12046.875  12125.000  11796.875  11906.250  11981.250
    System B  10937.500  10718.750  10734.375  10718.750  11000.000  10821.875
    System C  11890.320  11749.699  11765.323  12155.938  11765.323  11865.321
    System D  12359.454  12343.829  12359.454  12390.704  12406.329  12371.954
    Average                                                          11760.100

System A: AMD Sempron 3200 (1 logical x64 CPU), 1 GB RAM

System B: AMD Athlon 3200+ (1 logical x64 CPU), 1 GB RAM

System C: Intel Pentium 4 2.8 GHz (1 logical x86 CPU), 1 GB RAM

System D: Two Intel Xeon 3.0 GHz (2 dual core, HyperThreaded CPUs providing 8 logical x64 CPUs), 2 GB RAM
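The post does not show how each test run was timed. As a minimal sketch, assuming a System.Diagnostics.Stopwatch is used (the original program may well have used a different clock), a run of five timed tests could look like the following. StandInCompute, TimeSingleThreadRun, and the module name are hypothetical names used only for illustration; StandInCompute is a placeholder for the Compute() function above.

```vbnet
Imports System
Imports System.Diagnostics

Module TimingSketch

  ' Hypothetical stand-in for the article's Compute(); any CPU-bound work will do.
  Private Function StandInCompute(ByVal InputValue As Double) As Double
    Return Math.Sqrt(InputValue) * Math.E
  End Function

  ' Runs the workload Iterations times on the calling thread and
  ' returns the elapsed wall-clock time in milliseconds.
  Public Function TimeSingleThreadRun(ByVal Iterations As Integer) As Double
    Dim Watch As Stopwatch = Stopwatch.StartNew()
    Dim Sink As Double = 0

    For IterationCounter As Integer = 1 To Iterations
      Sink += StandInCompute(CDbl(IterationCounter))
    Next

    Watch.Stop()
    Return Watch.Elapsed.TotalMilliseconds
  End Function

  Sub Main()
    Const TestCount As Integer = 5
    Dim Total As Double = 0

    ' Time five separate runs and report each one plus the average.
    For TestNumber As Integer = 1 To TestCount
      Dim Elapsed As Double = TimeSingleThreadRun(1000000)
      Total += Elapsed
      Console.WriteLine("Test {0}: {1:F3} ms", TestNumber, Elapsed)
    Next

    Console.WriteLine("Average: {0:F3} ms", Total / TestCount)
  End Sub

End Module
```

The stand-in workload is much lighter than Compute(), so the numbers it prints are not comparable to the table; the point is only the shape of the measurement loop.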

It is extremely important to understand the following information and disclaimers regarding these benchmark figures:

They are not to be taken as absolute numbers. They are taken on real-world systems with real-world OS installations, not clean benchmark systems. They are not to be used as any concrete measure of relative CPU performance; they simply illustrate the different relative performance characteristics of different multithreading techniques on different numbers of logical CPUs, in order to show how different processors can perform differently with different techniques.

The performance numbers on the single thread test show some truly fascinating results. System D (the dual Xeon machine) was actually our worst performer on the single threaded test, averaging 12371.954 ms against 10821.875 ms for System B, the fastest. Although the Xeons seemed to suffer a bit on single threaded performance, it is expected that they will maintain nearly identical performance when running 8 simultaneous threads non-atomically, while the single core CPUs should suffer a penalty for running multithreaded.
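Whether extra threads can actually run side by side depends on how many logical CPUs the machine exposes. As a small sketch, Environment.ProcessorCount reports that number from within a .Net program; the module name is hypothetical.

```vbnet
Imports System

Module LogicalCpuSketch

  Sub Main()
    ' Environment.ProcessorCount counts logical CPUs, so extra cores and
    ' Hyper-Threading both add to the total.
    Dim LogicalCpus As Integer = Environment.ProcessorCount

    Console.WriteLine("Logical CPUs: {0}", LogicalCpus)

    If LogicalCpus = 1 Then
      Console.WriteLine("Extra threads will have to share one CPU.")
    Else
      Console.WriteLine("Up to {0} threads can run at the same time.", LogicalCpus)
    End If
  End Sub

End Module
```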

Another mildly interesting item to note is the difference in the clock reports between the AMD systems (A and B) and the Intel systems (C and D). The AMD systems rounded milliseconds up to units of 0.025 milliseconds (I showed three decimal places in the table, but the program returned four). Since all of the test machines are running the same version of the .Net Framework, this appears to come from the underlying platform rather than from the framework. It is not relevant to the results of this test, but it is an interesting data point to remember for future use, in case it ever comes up.
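One thing the table itself shows is that every System A and System B figure is a whole multiple of 15.625 ms (1/64 of a second). That is consistent with the timings having been read from a clock that only advances once per tick of that length, though the post does not say which clock was used, so treat that reading as an assumption. The arithmetic can be checked with a short sketch; the module and variable names are hypothetical.

```vbnet
Imports System

Module TickCheckSketch

  Sub Main()
    ' Test 1 to Test 5 figures for System A and System B, copied from the table.
    Dim AmdTimes() As Double = {12031.25, 12046.875, 12125.0, 11796.875, 11906.25,
                                10937.5, 10718.75, 10734.375, 10718.75, 11000.0}

    ' Assumed tick length: 1/64 second, expressed in milliseconds.
    Const TickMs As Double = 15.625

    ' Each quotient comes out as a whole number of ticks.
    For Each Elapsed As Double In AmdTimes
      Console.WriteLine("{0,10:F3} ms = {1} ticks", Elapsed, Elapsed / TickMs)
    Next
  End Sub

End Module
```

The System C and System D figures step by roughly 15.624 ms between neighbouring values (for example 11749.699 and 11765.323), so those machines show a similar granularity with a slightly different tick length.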

Stay tuned for the next post, which will show our first multithreaded test.

J.Ja

About

Justin James is the Lead Architect for Conigent.
