Data Management

Tips on optimizing your SQL Server indexes

Proper indexing is a major factor in performance tuning your SQL Server database. Follow these tips to get your applications in tiptop shape.

Within SQL Server there are several tools that you can employ to monitor, tune, and optimize performance. Here, I will explain how to utilize and interpret the results of SQL Server tools to optimize the use of database indexes. I'll also discuss general indexing topics.

General indexing information
Indexes are one of the biggest determinates of database performance. I will only touch on the fundamentals of this topic due to its complexity, but there are many good books available on this topic that provide a more in-depth treatment. The two types of SQL Server indexes I will discuss are clustered and nonclustered indexes. When reviewing what type of index to create, you should identify the data type and the column(s) stores. Also, you must consider what query types will run and the frequency with which they will be executed.

What type of index?
Columns that contain a relatively high degree of duplicate values and that are accessed in sequence are good candidates for clustered indexes because SQL Server physically reorders the data rows and stores the values in either ascending (the default) or descending order so they can be quickly retrieved. Also, columns that are searched in a range are good candidates. Due to the physical reordering of data you can only have one clustered index per table.

Conversely, columns that contain many unique values are good candidates for nonclustered indexes. You may have up to 249 nonclustered indexes per table, although I can’t imagine a real world application requiring nearly that many indexes.

When tables use primary keys, SQL Server automatically (by default) creates a unique cluster index on the column(s) comprising the key. Clearly, the uniqueness of the primary key is enforced through the creation of the unique index on the column(s). When creating foreign key relationships, it's a good idea to create a nonclustered index on the foreign key column if you're planning on using it frequently in joins. Tables that have a clustered index maintain the relationship between the data pages via a linked list (e.g., leaf and nonleaf level). Conversely, if no clustered index exists on a table, SQL Server will store the data pages in a heap.

Data pages
When an index is created, SQL Server creates data pages, which are pointers that assist in finding the data row that contains the information of interest. When the index is established, a fillfactor is set. The purpose of a fillfactor is to designate the percentage of the data pages filled upon index creation. Over time, the free space is consumed as modifications occur, which causes page splits. The result of the page splits is that they degrade the performance of the indexes and thus the queries that utilize them by causing the data storage to become fragmented. The index’s fillfactor is set at the time of the indexes creation and isn't dynamically maintained.

To update the fillfactor in the data pages, we can drop and recreate the indexes and reset the fillfactor (keep in mind that this will negatively impact concurrent database activity and should be used judiciously in production systems). DBCC INDEXDEFRAG and DBCC DBREINDEX are statements that defragment both clustered and nonclustered indexes. INDEXDEFRAG is an online operation (i.e., it does not block other table activity, such as a query), whereas DBREINDEX physically rebuilds the index(s). In most cases, rebuilding an index achieves greater defragmentation, but it comes at the cost of blocking concurrent activity on that table. INDEXDEFRAG can take longer to complete when large fragmented indexes are present because it executes in small transactional blocks.

The fillfactor
When you perform one of these actions, the database engine can more efficiently return indexed data. Fillfactor tweaking is beyond the scope of this article, but should be employed with a careful eye toward the intended usage of the table for which the index is created.

SQL Server dynamically chooses which indexes to use during the execution of a query. To make this choice, SQL Server uses statistics about the distribution of keys in each index to determine which indexes it will use to process the query. It is essential to consider that the statistics SQL Server is using can become out of date during the course of normal database activity (e.g., inserts, deletes, and updates on the table). To ascertain the current status of your statistics, you can execute DBCC SHOWCONTIG. When you determine that your statistics are out of date, you should run the UPDATE STATISTICS statement on the table to allow SQL Server to refresh its information about the indexes.

Establish a plan
SQL Server provides a utility that simplifies and automates the maintenance of a database, including the indexes. This tool is called the Database Maintenance Plan Wizard (DMPW). If you run this wizard, you'll see that you can schedule the statistics on the indexes in the database to be updated as a regularly scheduled job, thus alleviating the burden of manually rebuilding your indexes. Another mutually exclusive choice in the DMPW is to reorganize data and data pages, which effectively drops and recreates your indexes with a specified fillfactor.


Editor's Picks