An introductory look at accessing data with ADO.NET DataSets

For developers who are new to ADO.NET, the idea of accessing data without the Recordset can be disconcerting. Discover ADO.NET's replacement, the DataSet, and find out how it works.

One of the more startling moments that an ADO developer experiences when learning about ADO.NET comes when you realize that there’s no Recordset in the latter’s object hierarchy. Instead, ADO.NET makes use of a new and significantly more powerful component to provide the functionality provided by ADO’s Recordset: the DataSet. Let's look at what a DataSet is, how it works, and how you access the data.

Overview of a DataSet
A DataSet, a resident of the System.Data namespace, is most precisely defined as a provider-neutral, in-memory, and disconnected relational data structure. It provides support for the standard view, add, remove, and update data operations for the data it represents, and it isn’t limited only to database data. DataSet is composed of several components, including:
  • DataTable: One or more DataTable objects represent the data from a source in familiar row and column format.
  • DataRow: Each DataTable contains multiple DataRow objects that contain the data from the source in a record or record-like layout.
  • DataColumn: Multiple DataColumn objects define the columns for each DataRow.

DataSets are completely provider neutral, meaning that no aspect of their functionality is dependent upon the underlying provider you use to connect to your data source. The fact that a DataSet is disconnected means that the underlying connection does not have to be open during the entire life of a DataSet object, which lets you make very efficient use of available database connections.

You can populate a DataSet with data in one of several different ways: Through a set of underlying provider-specific command objects, from an XML document or document fragment, or by supplying the data manually. I’ll explain more about retrieving data from a database, because that’s likely the first task you’ll want to perform. It’s important to keep in mind, however, that DataSets are flexible enough to not require a database source for data.

DataAdapter: The middleman
The job of translating between the provider-neutral DataSet and provider-specific connection and command objects falls to the DataAdapter. The DataAdapter for a DataSet takes care of retrieving, updating, inserting, and deleting data by making use of up to four internal command objects, which you can set or retrieve through this set of four public properties:
  • SelectCommand: Retrieves data from the data source to populate a DataSet
  • InsertCommand: Inserts new data to the data source from new data added to a DataSet
  • UpdateCommand: Updates existing data from changes made to the data in the DataSet
  • DeleteCommand: Deletes existing data when data is deleted from a DataSet

The DataAdapter class is abstract, so you can’t create it directly. Instead, you’ll need to use one of the provider-specific derived child classes. For example, to build a DataSet based on the contents of a hypothetical Customers table in a SQL Server database, you’d make use of the System.Data.SqlClient.SqlDataAdapter class. You could use something similar to the following C# code to accomplish this task, assuming that the SqlConnectioncn has already been created:
SqlCommand cm = new SqlCommand("SELECT * FROM Customers;",conn);
SqlDataAdapter sda = new SqlDataAdapter();
sda.SelectCommand = cm;
DataSet ds = new DataSet();


After creating a SqlCommand to represent the query you want to run, you would create a new SqlDataAdapter class and specify the SqlCommand as the SelectCommand for the adapter using the SelectCommand property. Note that you don’t need to define all four commands for an adapter to use it. You could also have passed the SELECT statement into SqlDataAdapter’s constructor method. Next, populate the DataSet with data using the adapter’s Fill method. At this point, the connection cn no longer needs to be open, and can be closed without compromising the usefulness of the data.

Accessing data
Once my DataSet has been filled, it’s a straightforward matter to access the data. Remember that each table in a DataSet is represented by a DataTable object, which is exposed through the DataSet’s Tables collection. Each DataTable, in turn, contains a collection of DataRows, which you can index into in order to retrieve the data for a given row. To extend my previous example, if you then wanted to print the contents of the first column in the first row of the Customers to the console, you could use the following code:

You can also retrieve columns by name by passing a string index for the column. Assuming that the first column in Customer is called CustomerID, the following code would work just as well, and be much cleaner:

For cleaner code still, it’s possible to extract a DataRow from the Rows collection and access it directly:
DataRow dr = ds.Tables[0].Rows[0];


Working with multiple tables
Thus far, I’ve not mentioned one particularly cool feature of DataSet: It can represent data from multiple tables. You probably noticed that the above code indexed into a Tables collection before accessing a row of data. You can access multiple DataTables in this fashion, either by index or by name.

To fill a DataSet with multiple DataTables, you can use multiple DataAdapters:
SqlDataAdapter sda = new SqlDataAdapter("SELECT * FROM Customers;",cn);
SqlDataAdapter sda2 = new SqlDataAdapter(“SELECT * FROM Products;”,cn);
DataSet ds = new DataSet();


The data from Customers will be available in the first DataTable in the ds.Tables collection, while the Products data will be the second item. You could also refer to these tables by name. ADO.NET assigns generic names to each DataTable automatically. In this case, Customers would be named Table, while Products would be named Table1. You can override these default names by specifying a name when calling the Fill method like so:


While using multiple DataAdapters works, it’s inefficient because each Fill method requires a trip to the database. If your database provider supports it, you can use a single adapter and a batch query to load the data:
SqlDataAdapter sda = new SqlDataAdapter(
 “SELECT * FROM Customers;” +
“SELECT * FROM Products;”);


However, you’re stuck with the default table-naming behavior once again. To work around this, try defining new TableMapping properties for the adapter:


Only scratching the surface
DataSet is the centerpiece of Microsoft’s ADO.NET data access API, and as such, it exposes many powerful features. I’ve only discussed a small part of the functionality available as part of the DataSet object. I will cover more on this functionality in future articles.


Editor's Picks