CXO

Comparing cursor vs. WHILE loop performance in SQL Server 2008

Tim Chapman discusses your iteration choices in SQL Server 2008 and explains why determining which option performs better depends on the situation.

Verfasst von

chapman.tim-gmail

Nov 11, 2009

We may earn from vendors via affiliate links or sponsorships. This might affect product placement on our site, but not the content of our reviews. See our Terms of Use for details.

In a previous TechRepublic column about SQL Server 2005, I showed how you can convert some of your iterative queries (cursors) to set-based operations in order to enhance performance. You should take this approach in most circumstances, as you’ll potentially see performance gains when switching from a looping construct to a set-based construct.

There are situations in which you need to individually handle each row in a result set. SQL Server 2008 provides a minimal set of tools — cursors and WHILE loops — to accomplish this task. I’ll take a closer look at each option and explain why it’s difficult to pinpoint which is better in terms of performance.

Cursors

Cursors are a looping construct built inside the database engine and come with a wide variety of features. Cursors allow you to fetch a set of data, loop through each record, and modify the values as necessary; then, you can easily assign these values to variables and perform processing on these values. Depending on the type of cursor you request, you can even fetch records that you’ve previously fetched.

Because a cursor is an actual object inside the database engine, there is a little overhead involved in creating the cursor and destroying it. Also, a majority of cursor operations occur in tempdb, so a heavily used tempdb will be even more overloaded with the use of cursors.

The types of cursors used are very important in terms of performance. Below is a list of the available cursor types as listed on Microsoft’s SQL Server Books Online.

FORWARD_ONLY
Specifies that the cursor can only be scrolled from the first to the last row. FETCH NEXT is the only supported fetch option. If FORWARD_ONLY is specified without the STATIC, KEYSET, or DYNAMIC keywords, the cursor operates as a DYNAMIC cursor. When neither FORWARD_ONLY nor SCROLL is specified, FORWARD_ONLY is the default, unless the keywords STATIC, KEYSET, or DYNAMIC are specified. STATIC, KEYSET, and DYNAMIC cursors default to SCROLL. Unlike database APIs such as ODBC and ADO, FORWARD_ONLY is supported with STATIC, KEYSET, and DYNAMIC Transact-SQL cursors.

STATIC
Defines a cursor that makes a temporary copy of the data to be used by the cursor. All requests to the cursor are answered from this temporary table in tempdb; therefore, modifications made to base tables are not reflected in the data returned by fetches made to this cursor, and this cursor does not allow modifications.

KEYSET
Specifies that the membership and order of rows in the cursor are fixed when the cursor is opened. The set of keys that uniquely identify the rows is built into a table in tempdb known as the keyset. Changes to nonkey values in the base tables, either made by the cursor owner or committed by other users, are visible as the owner scrolls around the cursor. Inserts made by other users are not visible (inserts cannot be made through a Transact-SQL server cursor). If a row is deleted, an attempt to fetch the row returns an @@FETCH_STATUS of -2. Updates of key values from outside the cursor resemble a delete of the old row followed by an insert of the new row. The row with the new values is not visible, and attempts to fetch the row with the old values return an @@FETCH_STATUS of -2. The new values are visible if the update is done through the cursor by specifying the WHERE CURRENT OF clause.

DYNAMIC
Defines a cursor that reflects all data changes made to the rows in its result set as you scroll around the cursor. The data values, order, and membership of the rows can change on each fetch. The ABSOLUTE fetch option is not supported with dynamic cursors.

FAST_FORWARD
Specifies a FORWARD_ONLY, READ_ONLY cursor with performance optimizations enabled. FAST_FORWARD cannot be specified if SCROLL or FOR_UPDATE is also specified.
READ_ONLY
Prevents updates made through this cursor. The cursor cannot be referenced in a WHERE CURRENT OF clause in an UPDATE or DELETE statement. This option overrides the default capability of a cursor to be updated.
SCROLL_LOCKS
Specifies that positioned updates or deletes made through the cursor are guaranteed to succeed. SQL Server locks the rows as they are read into the cursor to ensure their availability for later modifications. SCROLL_LOCKS cannot be specified if FAST_FORWARD or STATIC is also specified.

OPTIMISTIC
Specifies that positioned updates or deletes made through the cursor do not succeed if the row has been updated since it was read into the cursor. SQL Server does not lock rows as they are read into the cursor. It instead uses comparisons of timestamp column values, or a checksum value if the table has no timestamp column, to determine whether the row was modified after it was read into the cursor. If the row was modified, the attempted positioned update or delete fails. OPTIMISTIC cannot be specified if FAST_FORWARD is also specified.

WHILE loops

A WHILE loop is a programming construct that you’re likely familiar with from other programming languages. You define a condition at the beginning of the loop, and iteration will occur so long as this condition remains to be true.

WHILE loops are as easy to use as a cursor, but sometimes these constructs are a little more difficult to read and/or to understand, as a query must occur on the base table(s) that must fetch a single next row. This type of operation is very tricky to do in situations where your underlying base table does not have a primary key column (which it should have).

WHILE loops don’t provide some of the bells and whistles that come with cursors, such as the ability to easily go “backward” in the result set. Although, I’ve never been in a situation in which I did anything other than a straight loop through a result set with a cursor.

One advantage of the WHILE loop is that no objects must be created in memory to facilitate the looping through a set of records as is necessary with a cursor.

Which is better in terms of performance?

It really depends on the situation. Let’s set up an example and take a look at a couple of specific scenarios.

The code snippet below creates and populates our table for testing our two constructs. Notice the fld3 field in the table declaration; I’m making this a larger field so that fewer rows are created on a page of data. In my opinion, this will make the test a little more realistic, as it will “simulate” a table that contains a variety of character columns along with some additional fields of data. This will essentially make it so that a search for a single value based on the primary key will have to traverse a couple of data pages before the value is found.

use tempdb
GO

IF OBJECT_ID('tempdb..CursorTest','u') IS NOT NULL
DROP TABLE CursorTest
GO

CREATE TABLE CursorTest

idcol INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,

fld1 INT,

fld2 INT,

fld3 CHAR(800)

)
GO

SET NOCOUNT ON
DECLARE  @x INT = 10000
WHILE @x > 0
BEGIN
INSERT INTO CursorTest (fld1, fld2, fld3)
SELECT 1, RAND() * 100 * DATEPART(ms, GETDATE()), LEFT(REPLICATE(CAST(NEWID() AS VARCHAR(36)),30),800)
SET @x -= 1
END

With my data loaded, I can start to compare the cursor and the WHILE loop. (Note: These examples do no real processing.) I take the idcol field from the record and print it to the screen for each record in the table — simple stuff.

First, I’ll take a look at the cursor. In the example below, I use a FAST_FORWARD cursor. This type of cursor is the fastest cursor available in terms of strictly looping through records. As noted in the section on cursor types above, this is a combination of a READ_ONLY and a FORWARD_ONLY cursor.

DECLARE @Variable1 INT, @Variable2 INT

DECLARE CursorName CURSOR FAST_FORWARD
FOR
SELECT idcol
FROM CursorTest

OPEN CursorName
FETCH NEXT FROM CursorName
INTO @Variable1

WHILE @@FETCH_STATUS = 0

BEGIN

PRINT CAST(@Variable1 AS VARCHAR(5))

FETCH NEXT FROM CursorName
INTO @Variable1

END

CLOSE CursorName

DEALLOCATE CursorName

To get a true look at how many reads are performed on the database, I’ll need to use SQL Server Profiler. This tool will give me a combined aggregation of the reads taken for each record rather than each individual record, as would be indicated through SET STATISTICS IO ON. The cursor example above produces 11114 logical reads from the database.

The following snippet performs the same operation as the cursor above but through the use of a WHILE loop. Notice that it is necessary to perform a query on the base table for each record returned. Also, the use of an ORDER BY statement is necessary, as is the use of a primary key column.

DECLARE @Rows INT, @IdCol INT
SET @Rows = 1
SET @IdCol = 0
WHILE @Rows > 0
BEGIN
SELECT TOP 1
@idcol = idcol
FROM CursorTest
WHERE
idcol >= @IdCol
ORDER BY idcol

SET @Rows = @@ROWCOUNT
PRINT CAST(@IdCol AS VARCHAR(5))
SET @IdCol += 1
END

The WHILE loop example performs 31130 logical reads on the base table.

So, for this example, the WHILE loop is almost three times as slow as the cursor approach. However, if I were to change the cursor example from a FAST_FORWARD to a STATIC, the number of logical reads goes from 11114 to 41327.

Try both approaches

If you’re still wondering, so, which is better: cursors or WHILE loops? Again, it really depends on your situation. I almost always use a cursor to loop through records when necessary. The cursor format is a little more intuitive for me and, since I just use the constructs to loop through the result set once, it makes sense to use the FAST_FORWARD cursor. Remember that the type of cursor you use will have a huge impact on the performance of your looping construct.

The next time you’re faced with a looping situation use the tools discussed in this column and make a concerted effort to examine the performance of each approach. Comparing the number of reads from the database as well as the impact on tempdb (which I didn’t do here, but you can do through performance counters) will give you a lot of insight as to which option to use. The faster you can make your looping approach to solving the problem, the better off you’ll be.

TechRepublic’s Servers and Storage newsletter, delivered on Monday and Wednesday, offers tips that will help you manage and optimize your data center. Automatically sign up today!

chapman.tim-gmail

Tim Chapman is a SQL Server MVP, a database architect, and an administrator who works as an independent consultant in Raleigh, NC, and has more than nine years of IT experience.