Data Management optimize

Detecting index fragmentation in SQL Server 2005

SQL Server 2005 has tools to help find fragmentation. With a little bit of programming, you can use these tools to see which tables need to be reindexed. Tim Chapman shows you to use the sys.dm_db_index_physical_stats function.

When data is inserted, updated, or deleted in a database table, if indexes exist that operate on the table, they must be maintained to reflect the table data changes. The maintenance of these indexes will eventually cause the indexes to become less efficient. This inefficiency is typically due to the index becoming fragmented.

Fragmentation in action

There are two types of fragmentation: external and internal. External fragmentation occurs when the logical ordering of the index does not match the physical ordering of the index. This causes SQL Server 2005 to perform extra work to return ordered results. For the most part, external fragmentation isn't too big of a deal for specific searches that return very few records or queries that return result sets that do not need to be ordered.

Internal fragmentation occurs when there is too much free space in the index pages. Typically, some free space is desirable, especially when the index is created or rebuilt. You can specify the Fill Factor setting when the index is created or rebuilt to indicate a percentage of how full the index pages are when created. If the index pages are too fragmented, it will cause queries to take longer (because of the extra reads required to find data) and cause your indexes to grow larger than necessary. If no space is available in the index data pages, data changes (primarily inserts) will cause page splits, which also require additional system resources to perform.

Finding fragmentation

In versions prior to SQL Server 2005, the DBCC SHOWCONTIG command is used to determine the fragmentation of a specified table or index. This tool does a good job of detailing fragmentation and gives an option to return the data as a dataset for additional programming.

SQL Server 2005 improves upon DBCC SHOWCONTIG by providing a dynamic management function to detect index fragmentation. The sys.dm_db_index_physical_stats function accepts parameters such as the database, database table, and index for which you want to find fragmentation. There are several options that allow you to specify the level of detail that you want to see in regards to index fragmentation. The sys.dm_db_index_physical_stats function returns tabular data regarding one particular table or index.

An example of fragmentation in action

Since sys.dm_db_index_physical_stats is a function, I should be able to use the new CROSS APPLY operator to join on my sys.tables system view to return fragmentation for all tables in the database -- but the function doesn't allow me to do so. I think it has something to do with the timing of when the function was written vs. when the CROSS APPLY functionality was written. The workaround is that I need to write a wrapper function around the sys.dm_db_index_physical_stats function to allow me to join with the sys.tables view. Here is the script to create this wrapper function:

CREATE FUNCTION sys_PhysicalIndexStatistics_Wrapper

 (

         @DatabaseID INT,

         @ObjectID INT,

         @IndexID INT,

         @PartitionNumber INT,

         @Mode INT

 )

 RETURNS @IndexStats TABLE

 (

         DatabaseID SMALLINT,

         ObjectID INT,

         IndexID INT,

         PartitionNumber INT,

         IndexDescription VARCHAR(100),

         AllocationTypeDescription VARCHAR(100),

         IndexDepth TINYINT,

         IndexLevel TINYINT,

         AverageFragmentation FLOAT,

         FragmentCount BIGINT,

         AverageFragmentSize FLOAT,

         TablePageCount BIGINT,

         AveragePageSpaceUsed FLOAT,

         RecordCount BIGINT,

         GhostRecordCount BIGINT,

         VersionGhostRecordCount BIGINT,

         MinimumRecordSize INT,

         MaxRecordSize INT,

         AverageRecordSize FLOAT,

         ForwardedRecordCount BIGINT

 )

 BEGIN     INSERT INTO @IndexStats

         (

             DatabaseID, ObjectID, IndexID, PartitionNumber, IndexDescription, AllocationTypeDescription, IndexDepth,

             IndexLevel, AverageFragmentation, FragmentCount, AverageFragmentSize, TablePageCount,

             AveragePageSpaceUsed, RecordCount, GhostRecordCount, VersionGhostRecordCount, MinimumRecordSize,

             MaxRecordSize, AverageRecordSize, ForwardedRecordCount

         )

     SELECT

             database_id, object_id, index_id, partition_number, index_type_desc,

             alloc_unit_type_desc, index_depth,

             index_level, avg_fragmentation_in_percent, fragment_count, avg_fragment_size_in_pages, page_count,

             avg_page_space_used_in_percent, record_count, ghost_record_count, version_ghost_record_count, min_record_size_in_bytes,

             max_record_size_in_bytes, avg_record_size_in_bytes, forwarded_record_count

        FROM   

                 sys.dm_db_index_physical_stats

              (

                        @DatabaseID,

                        @ObjectID,

                   @IndexID,

                   @PartitionNumber,

                   @Mode

              )

     RETURN

END

To call this function, I simply use the CROSS APPLY operator as a join between the sys.tables view and the wrapper function. Here is a sample call to return all tables where the AverageFragmentation is greater than zero:

SELECT t.name,s.*

 FROM sys.tables t

 CROSS APPLY sys_PhysicalIndexStatistics_Wrapper(DB_ID(), object_id, NULL, NULL, NULL) s

 WHERE AverageFragmentation > 0

The following is a screenshot of the results of the above function on a database with a reasonable amount of fragmentation.

Index fragmentation in SQL Server 2005

Now I need to know what the results are telling me. I am most interested in the AverageFragmentation column, which tells me the logical fragmentation for the index or heap. For anything over 30%, I will seriously consider reindexing the table. I can live with anything under 30%, but I might consider reorganizing the index.

Check out this Microsoft Books Online article about sys.dm_db_index_physical_stats. It details the remaining column data returned by this function.

Rebuild routine

I always have a routine in place on the database systems that I administer to automatically rebuild indexes when necessary. The routines are ones that I have written, and I take various details into account.

The more you work with the sys.dm_db_index_physical_stats function, the more you will understand when your tables need reindexing. By using the sys.dm_db_index_physical_stats function, it will hopefully help you ensure your tables and indexes are efficient as possible.

Tim Chapman a SQL Server database administrator and consultant who works for a bank in Louisville, KY. Tim has more than eight years of IT experience, and he is a Microsoft certified Database Developer and Administrator. If you would like to contact Tim, please e-mail him at chapman.tim@gmail.com.

-----------------------------------------------------------------------------------------

Get database tips in your inbox

TechRepublic's free Database Management newsletter, delivered each Tuesday, contains hands-on SQL Server and Oracle tips and resources. Automatically subscribe today!

About

Tim Chapman is a SQL Server MVP, a database architect, and an administrator who works as an independent consultant in Raleigh, NC, and has more than nine years of IT experience.

0 comments