When I teach classes on Microsoft technology, I tell my students, somewhat facetiously, that if Microsoft makes something easy in the design phase of a development project, it’s a sure sign that you shouldn’t use it in a production application. Perhaps the two best examples of this axiom are the data-bound control from Visual Basic 3.0 and the Visual InterDev Design Time Control (DTC). The VB3 data-bound control made great demos, but its performance effects on the underlying database had even Microsoft’s own consulting services group recommending against its use in a production application.

InterDev DTCs are legendary for the number of developers who were suckered into using them for a simple application, only to find that they had to rewrite the functionality from scratch when they wanted to extend the controls or modify their output, because a DTC was neither modifiable nor extensible. Consequently, when I first saw the .NET DataSet object, I was cautiously optimistic. Unfortunately, many developers chose not to be cautious at all.

What’s wrong with the DataSet?
I’m not saying that there’s anything inherently wrong with the DataSet object. But it’s like any other tool—you need to understand how to use it appropriately. Although it’s a useful tool for Windows Forms applications, it’s much less useful for Web application development.

Let’s look at a simple example. Suppose you use a DataSet to return a set of 1,000 products to display in a DataGrid on a Web form. Since you might want to sort or filter the data later, you choose to save the DataSet in a session variable. Not knowing any better, you also leave the default page ViewState turned on. When a user navigates to this page, three copies of the data exist. One sits on the server in the session variable. One is in the ViewState, stored as the contents of the DataGrid. And one is in the rendered HTML stream, in the form of the HTML table markup that displays the grid. Now multiply the server-side copy by the number of users to assess the impact on server memory, and multiply the two copies sent to the browser by the number of users to assess the impact on bandwidth utilization. On a high-traffic site, you can quickly overload a server and its available network bandwidth.
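To put rough numbers on that multiplication, here is a minimal back-of-the-envelope sketch in C#. The row size and user count are assumptions picked purely for illustration, not measurements, and the class name is hypothetical.

```csharp
using System;

// Back-of-the-envelope estimate for the 1,000-product example.
// bytesPerRow and concurrentUsers are ASSUMED values, not measurements.
public static class DataSetFootprint
{
    public static void Main()
    {
        const long rows = 1000;            // products returned by the query (from the example)
        const long bytesPerRow = 200;      // assumed average size of one product row
        const long concurrentUsers = 500;  // assumed number of active sessions

        long oneCopy = rows * bytesPerRow;

        // One copy per user lives in session state on the server.
        long serverMemory = oneCopy * concurrentUsers;

        // Two copies travel to each browser: the ViewState and the rendered HTML table.
        // Real pages are larger (ViewState encoding, HTML markup), so treat this as a floor.
        long bytesPerPageView = 2 * oneCopy;
        long bytesForAllUsers = bytesPerPageView * concurrentUsers;

        Console.WriteLine("Session memory on the server: {0:N0} bytes", serverMemory);
        Console.WriteLine("Sent per page view:           {0:N0} bytes", bytesPerPageView);
        Console.WriteLine("Sent if every user loads once: {0:N0} bytes", bytesForAllUsers);
    }
}
```

With these assumed figures, the session copies alone come to 100,000,000 bytes of server memory, and a single page load by every user moves 200,000,000 bytes across the network. Plug in your own row size and traffic to see where your site lands.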

The answer: Use the DataReader
Though not as sexy, the DataReader is much more functional for a Web application. Because the DataReader object’s cursor is designed to iterate in a forward-only, read-only fashion over the results of a query, it’s very fast. Moreover, the DataReader holds only the current record in memory at any one time, never the entire result set. Like the DataSet, the DataReader can be bound to ASP.NET server controls (such as the DataGrid). More importantly, server resources and connection resources are released as soon as you’re finished traversing the results and close the reader. Build your data-bound pages using DataReaders to retrieve data from an underlying database whenever it’s important for the data to be as fresh as possible.
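A minimal code-behind sketch of this approach might look like the following. The page class, the grid ID, the connection string, and the Northwind-style query are all assumptions made for illustration; substitute your own.

```csharp
using System;
using System.Data.SqlClient;
using System.Web.UI;
using System.Web.UI.WebControls;

// Hypothetical code-behind for a page whose markup contains:
//   <asp:DataGrid id="ProductsGrid" runat="server" EnableViewState="false" />
public class ProductsPage : Page
{
    protected DataGrid ProductsGrid;

    // Assumed connection string; in a real application, read it from configuration.
    private const string ConnectionString =
        "server=(local);database=Northwind;Integrated Security=SSPI";

    protected void Page_Load(object sender, EventArgs e)
    {
        using (SqlConnection conn = new SqlConnection(ConnectionString))
        using (SqlCommand cmd = new SqlCommand(
            "SELECT ProductID, ProductName, UnitPrice FROM Products", conn))
        {
            conn.Open();

            // The reader streams rows forward-only; DataBind walks it once.
            using (SqlDataReader reader = cmd.ExecuteReader())
            {
                ProductsGrid.DataSource = reader;
                ProductsGrid.DataBind();
            }
            // Leaving the using blocks closes the reader and returns
            // the connection to the pool.
        }
    }
}
```

Because ViewState is turned off on the grid in this sketch, the page binds on every request, which is the trade-off you accept in exchange for fresh data and no stored copy.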

When should I use the DataSet?
The only time I recommend using the DataSet in a Web application is when the underlying data changes infrequently. For example, if you have a series of drop-down boxes or checklist items that come from a database but rarely change, it may make sense to load them in a DataSet in the Application_OnStart event and put them into the cache so that any page that needs the values will have them immediately available. This not only makes data retrieval faster for each page but also minimizes the number of hits to the underlying database. You can get an additional speed boost by caching the Web pages that rely on the cached DataSet for their values.
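As a rough sketch, the startup load could look like this in the Global.asax code-behind. The handler is written as Application_Start, the name commonly used in Global.asax; the cache key, connection string, and Categories query are assumptions for illustration.

```csharp
using System;
using System.Data;
using System.Data.SqlClient;
using System.Web;

// Hypothetical Global.asax code-behind.
public class Global : HttpApplication
{
    // Assumed connection string and cache key.
    private const string ConnectionString =
        "server=(local);database=Northwind;Integrated Security=SSPI";
    public const string LookupsKey = "Lookups";

    protected void Application_Start(object sender, EventArgs e)
    {
        DataSet lookups = new DataSet("Lookups");

        // The adapter opens and closes the connection itself during Fill.
        using (SqlDataAdapter adapter = new SqlDataAdapter(
            "SELECT CategoryID, CategoryName FROM Categories", ConnectionString))
        {
            adapter.Fill(lookups, "Categories");
        }

        // One shared copy for the whole application, not one per user.
        HttpRuntime.Cache.Insert(LookupsKey, lookups);
    }
}
```

A page would then read the values with something like `DataSet lookups = (DataSet)Cache["Lookups"];`. Cached items can be evicted under memory pressure, so code that reads the cache should be prepared for a null result.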

If you set a dependency between the cached Web pages and the cached DataSet, the Web pages will be regenerated whenever the DataSet changes. To make sure the DataSet is always current, create Update, Insert, and Delete triggers on the database tables that feed the DataSet, and have those triggers modify a control file on the site. Then set a dependency between the cached DataSet and the control file. Whenever the control file changes, the DataSet in the cache will be invalidated. Add code to the Session_OnStart event to check for the cached DataSet, regenerate it if necessary, and place it back in the cache. That way, whenever the underlying tables change, the cached DataSet is regenerated the next time a session starts.
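The two dependencies described above can be sketched roughly as follows. The helper class, cache key, control file name, connection string, and query are all hypothetical, and the database triggers that touch the control file are not shown. The page is assumed to carry an `<%@ OutputCache Duration="600" VaryByParam="none" %>` directive in its markup.

```csharp
using System;
using System.Data;
using System.Data.SqlClient;
using System.IO;
using System.Web;
using System.Web.Caching;
using System.Web.UI;

// Hypothetical helper: returns the cached DataSet, reloading it when missing.
public static class LookupCache
{
    public const string CacheKey = "Lookups";

    // Assumed values for illustration only.
    private const string ControlFileName = "lookups.ctl";
    private const string ConnectionString =
        "server=(local);database=Northwind;Integrated Security=SSPI";

    public static DataSet Get()
    {
        DataSet lookups = HttpRuntime.Cache[CacheKey] as DataSet;
        if (lookups == null)
        {
            lookups = new DataSet("Lookups");
            using (SqlDataAdapter adapter = new SqlDataAdapter(
                "SELECT CategoryID, CategoryName FROM Categories", ConnectionString))
            {
                adapter.Fill(lookups, "Categories");
            }

            // When the control file changes, ASP.NET removes this cache entry.
            string controlFile =
                Path.Combine(HttpRuntime.AppDomainAppPath, ControlFileName);
            HttpRuntime.Cache.Insert(
                CacheKey, lookups, new CacheDependency(controlFile));
        }
        return lookups;
    }
}

// Hypothetical output-cached page that depends on the cached DataSet.
public class CategoryListPage : Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        DataSet lookups = LookupCache.Get();

        // Invalidate this page's cached output when the "Lookups" entry is removed.
        Response.AddCacheItemDependency(LookupCache.CacheKey);

        // ... bind lookups.Tables["Categories"] to the page's controls here ...
    }
}
```

Calling `LookupCache.Get()` from Session_OnStart gives the check-and-regenerate behavior the text describes; calling it from the page as well, as in this sketch, covers the case where the entry is dropped in the middle of a session.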

Using the right tool for the right job is the best way to create optimized Web applications. Now you have some general guidelines for the right time to use the DataReader and the DataSet in your ASP.NET applications.