
Multithreading tutorial, part seven: Non-atomic performance


This is the final installment of a seven-part series demonstrating multithreading techniques and performance characteristics in VB.NET.

This week's post on multithreading takes us full circle, back to non-atomic operations. The first performance test in this series ran them in a single thread; this one runs them in multiple threads. Although the code is multithreaded and reads and writes shared variables, it has no thread safety whatsoever. As a result, in real code that depends on the contents of shared variables, those contents cannot be trusted. The variable that holds the number of completed computations is something of a special case: every thread does exactly the same thing to it, adding 1, so the order in which the threads run does not matter, and it reached the expected total in every run reported here. Even so, an unsynchronized += 1 is a read followed by a write, so an increment can be lost when two threads update the counter at the same moment. If the threads were doing something less predictable (such as multiplying by the iteration number or appending to a string of characters), the results would be chaotic at best. That is an important distinction to note: while this code happens to function, the same approach in a real program would probably not work, and would definitely not work as expected!
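To make the risk concrete, here is a minimal sketch that is not part of the downloadable project; the module and variable names are made up for illustration. Four threads increment one counter with a plain += 1 and another with Interlocked.Increment. On a machine with more than one logical CPU, the plain counter will usually finish below 4,000,000 because some increments overwrite each other, while the Interlocked counter always reaches it.

```vbnet
Imports System
Imports System.Threading

Module LostUpdateDemo
    ' Hypothetical counters for illustration only.
    Private PlainCounter As Integer = 0
    Private SafeCounter As Integer = 0

    Private Sub Worker()
        For i As Integer = 1 To 1000000
            PlainCounter += 1                    ' read, add, write: three steps, not atomic
            Interlocked.Increment(SafeCounter)   ' one atomic read-modify-write
        Next
    End Sub

    Sub Main()
        Dim threads(3) As Thread                 ' four threads (indexes 0 to 3)
        For i As Integer = 0 To threads.Length - 1
            threads(i) = New Thread(AddressOf Worker)
            threads(i).Start()
        Next
        For Each t As Thread In threads
            t.Join()                             ' wait for every thread to finish
        Next
        Console.WriteLine("Plain counter:       {0}", PlainCounter)
        Console.WriteLine("Interlocked counter: {0}", SafeCounter)
    End Sub
End Module
```

On a single logical CPU the two numbers may well match, which is one reason a race like this can go unnoticed during testing.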

You can download an installer for this program here. The installer also installs the full source code for the project in a subdirectory of the installation path. Feel free to try it out and tinker with it, or just read through it to get a better understanding of multithreading techniques.

This is the code that launches the threads:

Public Sub NonAtomicMultiThreadComputation(ByVal Iterations As Integer, Optional ByVal ThreadCount As Integer = 0)
  Dim twNonAtomic As NonAtomicThreadWorker
  Dim IntegerIterationCounter As Integer
  Dim iOriginalMaxThreads As Integer
  Dim iOriginalMinThreads As Integer
  Dim iOriginalMaxIOThreads As Integer
  Dim iOriginalMinIOThreads As Integer

  twNonAtomic = New NonAtomicThreadWorker

  ' Remember the current pool limits so they can be restored afterwards.
  Threading.ThreadPool.GetMaxThreads(iOriginalMaxThreads, iOriginalMaxIOThreads)
  Threading.ThreadPool.GetMinThreads(iOriginalMinThreads, iOriginalMinIOThreads)

  ' ThreadCount = 0 leaves thread management to the ThreadPool.
  If ThreadCount > 0 Then
    Threading.ThreadPool.SetMaxThreads(ThreadCount, ThreadCount)
    Threading.ThreadPool.SetMinThreads(ThreadCount, ThreadCount)
  End If

  ' Queue one work item per iteration; the iteration number is passed as its state.
  ' (Double.Parse on an Integer compiles only with Option Strict Off.)
  For IntegerIterationCounter = 1 To Iterations
    Threading.ThreadPool.QueueUserWorkItem(AddressOf twNonAtomic.ThreadProc, Double.Parse(IntegerIterationCounter))
  Next

  ' Busy-wait until every work item has reported completion.
  While NonAtomicThreadWorker.IntegerCompletedComputations < Iterations
  End While

  ' Restore the original pool limits.
  Threading.ThreadPool.SetMaxThreads(iOriginalMaxThreads, iOriginalMaxIOThreads)
  Threading.ThreadPool.SetMinThreads(iOriginalMinThreads, iOriginalMinIOThreads)

  twNonAtomic = Nothing
  IntegerIterationCounter = Nothing
End Sub
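The empty While loop in the launcher keeps a CPU busy while it waits, which on a single-CPU system competes with the worker threads. As a hedged alternative, and assuming .NET 4 or later (where CountdownEvent is available), the launcher could block until every work item has signalled. DoWork and the module name below are hypothetical stand-ins, not code from the project. Note that signalling is itself a synchronized operation, so this variant is no longer a purely non-atomic test.

```vbnet
Imports System
Imports System.Threading

Module WaitWithoutSpinning
    ' Hypothetical stand-in for the real computation.
    Private Sub DoWork(ByVal state As Object)
        Dim done As CountdownEvent = DirectCast(state, CountdownEvent)
        Math.Sqrt(12345.678)   ' placeholder work; the result is discarded
        done.Signal()          ' atomically records one finished work item
    End Sub

    Sub Main()
        Const Iterations As Integer = 100000
        Using done As New CountdownEvent(Iterations)
            For i As Integer = 1 To Iterations
                ThreadPool.QueueUserWorkItem(AddressOf DoWork, done)
            Next
            done.Wait()        ' blocks, without spinning, until the count reaches zero
        End Using
        Console.WriteLine("All work items finished.")
    End Sub
End Module
```

Because the waiting thread sleeps instead of polling, the worker threads get the CPU time the loop would otherwise consume, so timings from this variant would not be directly comparable to the figures in this post.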

And here is the code for the class itself that performs the work:

Public Class NonAtomicThreadWorker
  ' Shared fields: one copy for all instances and all threads, with no synchronization.
  Public Shared IntegerCompletedComputations As Integer = 0
  Private Shared DoubleStorage As Double

  Public Property Storage() As Double
    Get
      Return DoubleStorage
    End Get
    Set(ByVal value As Double)
      DoubleStorage = value
    End Set
  End Property

  Public Property CompletedComputations() As Integer
    Get
      Return IntegerCompletedComputations
    End Get
    Set(ByVal value As Integer)
      IntegerCompletedComputations = value
    End Set
  End Property

  ' Runs on a ThreadPool thread; StateObject carries the iteration number.
  Public Sub ThreadProc(ByVal StateObject As Object)
    Dim ttuComputation As ThreadTestUtilities
    ttuComputation = New ThreadTestUtilities
    ' Unsynchronized write: whichever thread writes last wins.
    Storage = ttuComputation.Compute(CDbl(StateObject))
    ' Unsynchronized read-modify-write on the shared counter.
    CompletedComputations += 1
    ttuComputation = Nothing
  End Sub

  Public Sub New()
  End Sub
End Class

Here are the results of our tests. All tests are for 1,000,000 iterations, and the results are in milliseconds per test run.
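The post does not show how the runs were timed. As a minimal sketch of one way to collect five timings and their average, assuming System.Diagnostics.Stopwatch (available since .NET 2.0): RunOnce is a hypothetical stand-in for a call such as NonAtomicMultiThreadComputation(1000000), and the module name is made up.

```vbnet
Imports System
Imports System.Diagnostics

Module TimingHarness
    ' Hypothetical workload standing in for one benchmark run.
    Private Sub RunOnce()
        Dim x As Double = 0
        For i As Integer = 1 To 1000000
            x += Math.Sqrt(i)
        Next
    End Sub

    Sub Main()
        Const Runs As Integer = 5
        Dim total As Double = 0
        For run As Integer = 1 To Runs
            Dim sw As Stopwatch = Stopwatch.StartNew()
            RunOnce()
            sw.Stop()
            Dim ms As Double = sw.Elapsed.TotalMilliseconds
            total += ms
            Console.WriteLine("Run {0}: {1:F3} ms", run, ms)
        Next
        ' Arithmetic mean of the individual runs.
        Console.WriteLine("Average: {0:F3} ms", total / Runs)
    End Sub
End Module
```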

TEST 1

This test allows the ThreadPool to manage the total number of minimum and maximum threads on its own:

System        Run 1      Run 2      Run 3      Run 4      Run 5    Average
System A  16578.125  17296.875  15359.375  14453.125  19265.625  16590.625
System B  16296.666  16296.666  17562.275  15859.172  14140.444  16031.045
System C  17328.347  19140.870  19625.251  19531.500  23125.296  19750.253
System D  30250.194  30140.818  29531.439  30078.318  29500.189  29900.192
Overall average                                                  20568.029

TEST 2

In this test, we limit the maximum number of threads to one per logical processor:

System        Run 1      Run 2      Run 3      Run 4      Run 5    Average
System A  11046.875  10796.875  10968.750  10906.250  10843.750  10912.500
System B  18624.762  19796.622  26874.656  13359.204  14577.938  18646.636
System C  12234.532  13390.796  26000.333  31641.030  12656.412  19184.621
System D  29468.939  29297.063  29468.939  29500.189  29406.438  29428.314
Overall average                                                  19543.018
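Limiting the pool to one thread per logical processor, as in TEST 2, might look like the following sketch. It assumes Environment.ProcessorCount as the source of the logical CPU count; the module name is made up. SetMinThreads and SetMaxThreads both return a Boolean, and when the runtime rejects the requested values they return False and leave the limits unchanged, so it is worth checking the result rather than assuming the limit took effect.

```vbnet
Imports System
Imports System.Threading

Module PoolLimits
    Sub Main()
        Dim workerMax, ioMax, workerMin, ioMin As Integer
        ThreadPool.GetMaxThreads(workerMax, ioMax)
        ThreadPool.GetMinThreads(workerMin, ioMin)

        Dim perCpu As Integer = Environment.ProcessorCount

        ' Set the minimum first so it never exceeds the new maximum.
        Dim minOk As Boolean = ThreadPool.SetMinThreads(perCpu, perCpu)
        Dim maxOk As Boolean = ThreadPool.SetMaxThreads(perCpu, perCpu)
        Console.WriteLine("SetMinThreads accepted: {0}, SetMaxThreads accepted: {1}", minOk, maxOk)

        ' ... queue the work items and wait for them here ...

        ' Restore the original limits: raise the maximum before the minimum.
        ThreadPool.SetMaxThreads(workerMax, ioMax)
        ThreadPool.SetMinThreads(workerMin, ioMin)
    End Sub
End Module
```

Whether a given value is accepted depends on the runtime version and the machine, which is why the sketch prints the results instead of assuming them.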

TEST 3

This test uses only one thread:

System        Run 1      Run 2      Run 3      Run 4      Run 5    Average
System A  10812.500  11078.125  12265.625  10781.250  13296.875  11646.875
System B  14749.811  14906.059  19718.498  38577.631  41999.462  25990.292
System C  12812.664  12609.536  13453.297  16078.331  13234.544  13637.674
System D  29406.438  29484.564  30234.569  29375.188  29468.939  29593.940
Overall average                                                  20217.195

TEST 4

This test uses two concurrent threads:

System        Run 1      Run 2      Run 3      Run 4      Run 5    Average
System A  12937.500  13453.125  14218.750  15593.750  13718.750  13984.375
System B  54249.306  30396.266  30036.824  21266.850  18468.986  30883.646
System C  19531.500  17172.095  19656.502  18203.358  22312.786  19375.248
System D  29437.688  29359.563  29625.190  29437.688  29468.939  29465.814
Overall average                                                  23427.271

TEST 5

Here we show four concurrent threads:

System        Run 1      Run 2      Run 3      Run 4      Run 5    Average
System A  22468.750  20437.500  23703.125  22828.125  21203.125  22128.125
System B  22719.041  32422.290  20484.637  30828.520  34328.564  28156.610
System C  36047.336  39578.632  37469.230  40000.512  34203.563  37459.855
System D  30297.069  29359.563  29343.938  29312.688  29359.563  29534.564
Overall average                                                  29319.789

TEST 6

This test uses eight concurrent threads:

System        Run 1      Run 2      Run 3      Run 4      Run 5    Average
System A  24906.250  25046.875  24250.000  24812.500  24734.375  24750.000
System B  37453.604  36453.125  29078.125  30562.500  33890.625  33487.596
System C  32891.046  32266.038  32969.172  32516.041  32766.044  32681.668
System D  29406.438  29422.063  29375.188  29390.813  29422.063  29403.313
Overall average                                                  30080.644

TEST 7

Finally, this test runs 16 simultaneous threads:

System        Run 1      Run 2      Run 3      Run 4      Run 5    Average
System A  24937.500  25125.000  24859.375  24546.875  24734.375  24840.625
System B  44749.427  43311.946  33749.568  36218.286  39358.871  39477.620
System C  32937.922  32719.169  32953.547  32734.794  32812.920  32831.670
System D  29468.939  45390.916  45765.918  34687.722  34578.346  37978.368
Overall average                                                  33782.071

System A: AMD Sempron 3200 (1 logical x64 CPU), 1 GB RAM

System B: AMD Athlon 3200+ (1 logical x64 CPU), 1 GB RAM

System C: Intel Pentium 4 2.8 GHz (1 logical x86 CPU), 1 GB RAM

System D: Two Intel Xeon 3.0 GHz (2 dual core, HyperThreaded CPUs providing 8 logical x64 CPUs), 2 GB RAM

It is extremely important to understand the following information and disclaimers regarding these benchmark figures:

They are not to be taken as absolute numbers. They are taken on real-world systems with real-world OS installations, not clean benchmark systems. They are not to be used as any concrete measure of relative CPU performance; they simply illustrate the different relative performance characteristics of different multithreading techniques on different numbers of logical CPUs, in order to show how different processors can perform differently with different techniques.

As you can see, running without any locks results in significant performance gains over the tests that were performed with various locking mechanisms. However, it is important to understand the ramifications of not using locking. If any data needs to be shared among threads, locking will have to come into play. Judicious use of locking should keep the performance hit from being too high. It is also important to note that in many situations, running your computations in a single thread will actually be faster than using multithreading; it all depends on your hardware and what you will actually be doing. As I have said before, "your mileage will vary." Test, test, and test again to see what techniques work best for your particular application.
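One way to keep the cost of locking low is to do as much work as possible on data that is private to the thread and take the lock only once, when merging the result into shared state. The sketch below illustrates that idea under stated assumptions: the module, the Worker routine, and the summing workload are hypothetical and do not come from the project.

```vbnet
Imports System
Imports System.Threading

Module BatchedLocking
    Private ReadOnly TotalLock As New Object()
    Private Total As Long = 0

    Private Sub Worker(ByVal state As Object)
        Dim count As Integer = CInt(state)
        Dim localSum As Long = 0
        For i As Integer = 1 To count
            localSum += i          ' private to this thread: no lock needed
        Next
        SyncLock TotalLock         ' one lock per thread, not one per iteration
            Total += localSum
        End SyncLock
    End Sub

    Sub Main()
        Dim threads(3) As Thread   ' four threads (indexes 0 to 3)
        For t As Integer = 0 To threads.Length - 1
            threads(t) = New Thread(AddressOf Worker)
            threads(t).Start(1000)
        Next
        For Each th As Thread In threads
            th.Join()
        Next
        Console.WriteLine(Total)   ' 4 threads x (1 + 2 + ... + 1000) = 2002000
    End Sub
End Module
```

Here each thread enters the lock once instead of 1,000 times, and the shared total is still correct on any number of CPUs.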

We have come to the end of this series. As always, feedback and comments are appreciated.

J.Ja

About

Justin James is the Lead Architect for Conigent.
