Tim Chapman discusses your iteration choices in SQL Server 2008 and explains why determining which option performs better depends on the situation.
In a previous TechRepublic column about SQL Server 2005, I showed how you can convert some of your iterative queries (cursors) to set-based operations in order to enhance performance. You should take this approach in most circumstances, as you’ll potentially see performance gains when switching from a looping construct to a set-based construct.
There are situations in which you need to individually handle each row in a result set. SQL Server 2008 provides a minimal set of tools — cursors and WHILE loops — to accomplish this task. I’ll take a closer look at each option and explain why it’s difficult to pinpoint which is better in terms of performance.
Cursors are a looping construct built inside the database engine and come with a wide variety of features. Cursors allow you to fetch a set of data, loop through each record, and modify the values as necessary; then, you can easily assign these values to variables and perform processing on these values. Depending on the type of cursor you request, you can even fetch records that you’ve previously fetched.
Because a cursor is an actual object inside the database engine, there is a little overhead involved in creating the cursor and destroying it. Also, a majority of cursor operations occur in tempdb, so a heavily used tempdb will be even more overloaded with the use of cursors.
The types of cursors used are very important in terms of performance. Below is a list of the available cursor types as listed on Microsoft’s SQL Server Books Online.
A WHILE loop is a programming construct that you’re likely familiar with from other programming languages. You define a condition at the beginning of the loop, and iteration will occur so long as this condition remains to be true.
WHILE loops are as easy to use as a cursor, but sometimes these constructs are a little more difficult to read and/or to understand, as a query must occur on the base table(s) that must fetch a single next row. This type of operation is very tricky to do in situations where your underlying base table does not have a primary key column (which it should have).
WHILE loops don’t provide some of the bells and whistles that come with cursors, such as the ability to easily go “backward” in the result set. Although, I’ve never been in a situation in which I did anything other than a straight loop through a result set with a cursor.
One advantage of the WHILE loop is that no objects must be created in memory to facilitate the looping through a set of records as is necessary with a cursor.
It really depends on the situation. Let’s set up an example and take a look at a couple of specific scenarios.
The code snippet below creates and populates our table for testing our two constructs. Notice the fld3 field in the table declaration; I’m making this a larger field so that fewer rows are created on a page of data. In my opinion, this will make the test a little more realistic, as it will “simulate” a table that contains a variety of character columns along with some additional fields of data. This will essentially make it so that a search for a single value based on the primary key will have to traverse a couple of data pages before the value is found.
use tempdb
GO
IF OBJECT_ID('tempdb..CursorTest','u') IS NOT NULL
DROP TABLE CursorTest
GO
CREATE TABLE CursorTest
(
idcol INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
fld1 INT,
fld2 INT,
fld3 CHAR(800)
) GO
SET NOCOUNT ON
DECLARE @x INT = 10000
WHILE @x > 0
BEGIN
INSERT INTO CursorTest (fld1, fld2, fld3)
SELECT 1, RAND() * 100 * DATEPART(ms, GETDATE()), LEFT(REPLICATE(CAST(NEWID() AS VARCHAR(36)),30),800)
SET @x -= 1
END
With my data loaded, I can start to compare the cursor and the WHILE loop. (Note: These examples do no real processing.) I take the idcol field from the record and print it to the screen for each record in the table — simple stuff.
First, I’ll take a look at the cursor. In the example below, I use a FAST_FORWARD cursor. This type of cursor is the fastest cursor available in terms of strictly looping through records. As noted in the section on cursor types above, this is a combination of a READ_ONLY and a FORWARD_ONLY cursor.
DECLARE @Variable1 INT, @Variable2 INT
DECLARE CursorName CURSOR FAST_FORWARD
FOR
SELECT idcol
FROM CursorTest
OPEN CursorName
FETCH NEXT FROM CursorName
INTO @Variable1
WHILE @@FETCH_STATUS = 0
BEGIN
PRINT CAST(@Variable1 AS VARCHAR(5))
FETCH NEXT FROM CursorName
INTO @Variable1
END
CLOSE CursorName
DEALLOCATE CursorName
To get a true look at how many reads are performed on the database, I’ll need to use SQL Server Profiler. This tool will give me a combined aggregation of the reads taken for each record rather than each individual record, as would be indicated through SET STATISTICS IO ON. The cursor example above produces 11114 logical reads from the database.
The following snippet performs the same operation as the cursor above but through the use of a WHILE loop. Notice that it is necessary to perform a query on the base table for each record returned. Also, the use of an ORDER BY statement is necessary, as is the use of a primary key column.
DECLARE @Rows INT, @IdCol INT
SET @Rows = 1
SET @IdCol = 0
WHILE @Rows > 0
BEGIN
SELECT TOP 1
@idcol = idcol
FROM CursorTest
WHERE
idcol >= @IdCol
ORDER BY idcol
SET @Rows = @@ROWCOUNT
PRINT CAST(@IdCol AS VARCHAR(5))
SET @IdCol += 1
END
The WHILE loop example performs 31130 logical reads on the base table.
So, for this example, the WHILE loop is almost three times as slow as the cursor approach. However, if I were to change the cursor example from a FAST_FORWARD to a STATIC, the number of logical reads goes from 11114 to 41327.
If you’re still wondering, so, which is better: cursors or WHILE loops? Again, it really depends on your situation. I almost always use a cursor to loop through records when necessary. The cursor format is a little more intuitive for me and, since I just use the constructs to loop through the result set once, it makes sense to use the FAST_FORWARD cursor. Remember that the type of cursor you use will have a huge impact on the performance of your looping construct.
The next time you’re faced with a looping situation use the tools discussed in this column and make a concerted effort to examine the performance of each approach. Comparing the number of reads from the database as well as the impact on tempdb (which I didn’t do here, but you can do through performance counters) will give you a lot of insight as to which option to use. The faster you can make your looping approach to solving the problem, the better off you’ll be.
TechRepublic’s Servers and Storage newsletter, delivered on Monday and Wednesday, offers tips that will help you manage and optimize your data center. Automatically sign up today!