By Mark A. Strawmyer

Prior to the release of the Microsoft .NET Framework, programmers had few data access technology choices for interacting with data stored in a database. An ActiveX Data Object (ADO) Recordset was the option commonly used to hold query results and data contents. It originally required a constant database connection to maintain access to the data. Over time it was evolved to support disconnected applications as well as provide some limited support for eXtensible Markup Language (XML).

The release of ADO.NET offers a new object model for retrieving data. Now there are two data classes for query results and a slew of other classes to hold various data content. Since most programmers are used to only having a single Recordset, this raises the questions of why there are two data retrieval classes in Microsoft .NET and which is the best to use.

What is a DataReader?
A DataReader is a read-only stream of data returned from the database as the query executes. It only contains one row of data in memory at a time and is restricted to navigating forward only in the results one record at a time. The DataReader does support access to multiple result sets, but only one at a time and in the order retrieved. Just as in the original version of ADO, the data is no longer available through the DataReader once the connection to the data source is closed, which means a DataReader requires a connection to the database throughout its usage. Output parameters or return values are only available through the DataReader once the connection is closed.

DataReaders all implement the same IDataReader interface, which means that all of the DataReaders should provide the same set of functionality. Each DataReader implementation is optimized for a specific data provider such as SQL Server or Oracle. Version 1.0 of the Microsoft .NET Framework includes a System.Data.SqlClient.SqlDataReader that is specific to SQL Server and a System.Data.OleDb.OleDbDataReader that is more generic for OLEDB providers. Version 1.1 of the .NET Framework introduced the SqlCeDataReader for connecting to the SQL Server edition for Windows CE, OdbcDataReader for ODBC data sources, and OracleDataReader for connecting to Oracle databases. It isn”t required that you use the DataReaders from Microsoft, and each database vendor often has its own .NET provider. In addition, you can implement a DataReader of your own through the IDataReader interface.

Sample DataReader C# code
The C# sample code in Listing A demonstrates how to use a DataReader to retrieve data from the Northwind sample database included with Microsoft SQL Server. A connection is set up, the command object is created to retrieve the data, the connection is opened, and the data is retrieved and printed to the console window. The connection will automatically be closed when the DataReader is closed by using the appropriate CommandBehavior options on the Command object, which is a good habit to follow to ensure connections are not accidentally left open.

What is a DataSet?
DataSet is the core of the ADO.NET disconnected architecture and is used to store data in a disconnected state. It was designed to fully support the use of XML so an XML document can be read into a DataSet or a DataSet can be exported to XML. This allows a DataSet to be easily transported across a network so it can be used as a return from a Web service or other type of remote call. Changes to a DataSet can be propagated back to the data source from where the data originated. A DataSet is fully navigable forward or backward and the contents can be filtered, sorted, or searched as desired, making all records accessible at any given point.

The DataSet, part of the System.Data namespace, is not specific to any one data provider, but rather is independent of them. A DataSet relies on a DataAdapter specific to each provider to be the intermediary between the DataSet and the data store. The DataAdapter uses a DataReader for the specific provider to fill the contents of the DataSet.

Version 1.0 of the Microsoft .NET Framework includes a SqlDataAdapter specific to SQL Server and an OleDbDataAdapter that is more generic for OLEDB Providers. Version 1.1 of the .NET Framework introduced the OdbcDataAdapter for ODBC data sources, and OracleDataAdapter for connecting to Oracle databases. In addition, each database vendor often has its own .NET provider. In addition, you can implement your own DataAdapter through the IDataAdapter interface.

In prior implementations of ADO, a Recordset could access multiple query results, but was not aware of relationships between them. A DataSet can expose a hierarchical model of tables similar to a relational database. The object model can be used to retrieve records related to the current record. A DataSet can contain one or more DataTable objects that have primary key, foreign key, and constraints between them and can enforce constraints such as unique or be configured to ignore them. Rather than using a join to combine tables of data, you typically use a DataAdapter to retrieve each table separately, populate the DataSet contents with the multiple adapters, and then link them together in the DataSet through DataRelation objects.

Sample DataSet C# code
The C# sample code in Listing B demonstrates how to use a DataSet to retrieve the same data as the example above. This time a connection does not have to be opened or closed because the DataAdapter will automatically handle it.

How do I know which one to use?
You can use the following information to help you decide which data retrieval is right for your purposes.

When to consider using a DataReader:

  • The DataReader is a better choice for applications that require optimized read-only and forward-only access to data such as binding to a DataGrid control. The sooner the data is off-loaded from the DataReader and the connection is closed the better the application performance.
  • The DataReader is a better choice when you are planning to make repeated calls to retrieve small amounts of information or the data you are retrieving must be as up to date as possible each time it is used.

When to consider using a DataSet:

  • The DataSet is a better choice for applications that will not off-load the query result immediately, or when there is extensive processing such as complex business logic involved between data accesses. The DataSet will retrieve the data, off-load the data into memory and return the database connection to the connection pool, where as a DataReader would keep the connection locked open until processing is complete. This could easily cause a high traffic application to run out of available database connections.
  • The DataSet is a better choice when you need to navigate through the data more than once. For example, if you have multiple controls you need to build off the same data, then a DataSet is the better answer because a DataReader can only be read once so it can only be bound to a single control and would require the data to be retrieved for use with each control.
  • The DataSet is a better choice when the data does not change frequently enough to warrant always retrieving it from the database or is specific to the user requesting the data. A DataSet can be stored in Session or Application variables or cached through the System.Web.Caching.Cache class to improve application performance by not having to retrieve the data from the database each time it is needed.
  • The DataSet is a better choice when building a Web service that will return the retrieved data. Since a DataSet is serializable it can serve as the return value. Since a DataReader requires a persistent database connection, it cannot be used as a return type from a Web service.

It is important to remember there are now two data retrieval classes. Each was created for a purpose that is distinct from the other. Consider your options carefully before you make your decision.