Hardware

Stretching the memory capacity of an ADO.NET Dataset

Datasets are ADO.NET's new structure for storing data in memory, but what happens when memory is exhausted? An example shows how memory can be used up and how .NET reacts to the problem.


Datasets are ADO.NET’s new structure for storing data in memory. In some ways, they are similar to ADO’s Recordset object; however, Datasets go well beyond Recordsets by allowing you to store an entire schema in memory—including tables, relationships, and keys—along with the actual data. You can then make queries and changes against the Dataset without affecting the underlying database used to populate the Dataset.

Given that the Dataset stores all its data in memory, some people are concerned about what happens if you run out of memory. It’s easy to take an educated guess about what will happen, but the reality is a bit more complex. To show you how .NET reacts, I will explain how to create a project that continually loads data into a Dataset until all memory is consumed. Note that this is not a recommended process, but it will show you that it isn't particularly easy to fill up the memory.

Creating the TooMuchData project
Creating a project that will continually load data into a Dataset is simple. Open Visual Studio .NET and create a new VB.NET Windows application. Add a button to the form and then double-click the button to get to the code window. Once in the code window, enter the following:
Private Sub Button1_Click(ByVal sender As System.Object, _
ByVal e As System.EventArgs) Handles Button1.Click
Dim sConnString As String = "Server=localhost;Database=pubs;uid=sa;pwd=;"
Dim sSQL As String = "SELECT * FROM authors"
Dim daProduct As SqlDataAdapter = New SqlDataAdapter(sSQL, sConnString)
Dim myDS As New DataSet()
Do While True
daProduct.Fill(myDS, "authors")
If myDS.Tables("authors").Rows.Count Mod 100 = 0 Then
Debug.WriteLine(myDS.Tables("authors").Rows.Count.ToString())
End If
Loop
End Sub


The third line of code sets the connection string into the variable sConnString. You'll likely need to modify this line if you try to re-create this project, unless you are running it on the server and have the famous "sa/no password" user ID and password.

The next line simply sets up a simple SQL query. Select * isn't usually recommended, but here it doesn’t matter because your goal is to grab as many records and as much data as possible. Selecting the primary key field of the table isn't a problem either, because you're creating a weakly typed Dataset and the DataTable you will be loading does not have a primary key field set.

The next two lines create DataAdapter and Dataset objects. You then enter an infinite loop that calls the Fill method of DataAdapter and add the records to a DataTable called “authors.” The loop does include an If statement that shows the count of the rows if the current count is evenly divisible by 100. This isn't necessary, but it helps in two ways: First, you know that the application is still running, and second, you know roughly how many records are in the Dataset.

Running the application
Once you've created the application, you might want to make some changes to your machine before running it. As a Dataset grows in size, it consumes more memory. Once it consumes all the memory, Windows begins swapping memory to the swap file. On many machines, the swap file is quite large, so the application can run for hours. For example, the test machine for my sample application was a laptop with a 600-MHz PIII and 512-MB RAM. The paging file size was set as low as possible to minimize the time required to run the application; this is a key step if you want to test it as quickly as possible. Even then, the application can run for minutes or hours before filling up all available memory.

You can run the application and start the processing by clicking the button. It establishes a connection to the server, pulls data from the table, and places it into a DataTable object in the Dataset. The same records are stored in the Dataset until memory is eventually exhausted. You can watch memory usage via Task Manager, and you'll even notice available memory shrinking and growing again as it swaps memory to disk. On my test machine, the Dataset grew to approximately 1.4 million records before the machine ran out of memory.

Running out of memory
Once the machine runs out of memory, what you see depends on whether you're running in VS.NET or whether you're running a compiled version. If you're running a compiled version and don’t have error handling, you'll get no errors at all; it simply quits running. If you're running the code in VS.NET and don’t have any error handling, the program will end, but you'll get the following message in the debug window:
Fatal out of memory error.
The program '[2340] TooMuchData.exe' has exited with code 0 (0x0).


At this point, you may be tempted to simply add an error handler to the code to check for a System.OutOfMemoryException condition. For example, you can use a Try…Catch block to watch for the exception. A rather generic approach appears below:
Do While True
Try
daProduct.Fill(myDS, "authors")
If myDS.Tables("authors").Rows.Count Mod 100 = 0 Then
Debug.WriteLine(myDS.Tables("authors").Rows.Count.ToString())
End If
Catch ex As Exception
MessageBox.Show("Error: " & ex.Message)
End Try
Loop


Unfortunately, this approach doesn't work. Setting a breakpoint on the MessageBox.Show statement should show when this point is hit, but this point is never reached. Instead, when you finally run out of memory, the debug window displays the following message:
Fatal out of memory error.
An unhandled exception of type 'System.OutOfMemoryException' occurred in system.data.dll


It would be nice if your MessageBox displayed this message; instead, VS.NET (or the Framework) generates the message and indicates that, in this instance, trapping this particular error isn't a valid solution.

If you compile the application with error handling and run it, you get a different result. This time, you get a MessageBox indicating that you've encountered an unhandled exception of type System.OutOfMemoryException in system.data.dll. However, this MessageBox is from the Framework and isn't your MessageBox from your code.

Not as easy as it sounds
You can create a Dataset that consumes all available memory, but it's difficult to do. It requires a significant number of records and, more importantly, a significant amount of time. Given that it can take hours for the Dataset to fill to the point of consuming memory, few applications would ever be allowed to run long enough for this to become a problem. Granted, every machine will be different, depending on processor speed, memory, and swap file size, but the indication is that this won't be a likely scenario.

However unlikely this scenario might be, though, it would be useful to be able to trap the error. Unfortunately, the error handling is unable to catch the System.OutOfMemoryException error in this case, which means you may have difficulty catching this particular error in your applications.

Editor's Picks

Free Newsletters, In your Inbox