In the database world, partitioning is the horizontal splitting of rows in a table or index defined on a table in separate, smaller logical units. In fact, every table in SQL Server 2005 has at least one partition. You are essentially breaking one table or index up into smaller tables or indexes that act as one table or index. This can greatly increase performance because you are querying several smaller tables rather than one much larger one. The advantages increase when the server has multiple processors, which allow the partitions to be queried in parallel in the same query, and when the partitions are spread across multiple file groups.
Prior to SQL Server 2005, the process of partitioning tables was much more involved for DBAs; the partitioned tables were actually user tables in the database that were typically "unioned" together in a view to simulate one large table. From there, DBAs could define INSTEAD OF triggers on the view to manipulate the data in the underlying tables. In SQL Server 2005, administration is much easier because the database engine maintains the partitions in the background, allowing you to focus on other tasks.
It's not practical to partition every table in your database. The typical usage scenario for a partitioned table is one that is very large and is experiencing degrading performance over time. I'll walk you through the steps of partitioning a large table.
Setting up the partitioning example
My example demonstrates how to partition a table that stores archived data to increase query performance. Before diving into the code for the example, here's a brief overview of the moving parts that make up a partitioned table.Partition Functions
When a table is partitioned, it is broken horizontally into smaller table partitions that are used together to assimilate the whole table. To do this, SQL Server has to know how to break the table into smaller parts. This is where a Partition Function comes into play. A Partition Function is the logic that maps the rows of a table or index into predefined partitions based on the values in a field of data. This will be the first partition object I create in the example.
You tell SQL Server how to horizontally split a table with a Partition Function. You also need to indicate how the partitions will be stored in the database. In SQL Server 2005, you map a table partition to a filegroup, which is basically a logic grouping that the database uses to store its data files and transaction log files. Every database has at least one filegroup named the Primary filegroup and additional filegroups for administrative and performance purposes. In a Partition Scheme, you can define that all partitions are mapped to the same filegroup, or you can use the scheme to split up the partitions across filegroups. The advantage is that, when the filegroups are split across individual disks, SQL Server will be better equipped to use resources when retrieving data. This advantage becomes even greater when you are running queries against the partitioned tables on a server with multiple processors.
Creating a partitioned table
Step one is to define the partition function that you will use to map out the partitions in the table. In the Partition Function defined below, I use three partitions, one for each product type in the SalesHistoryArchive table, which I will define later. These partitions will essentially break the SalesHistoryArchive archive table into three different tables that are automatically maintained by SQL Server.
CREATE PARTITION FUNCTION [pf_Product_Partition](VARCHAR(10)) AS RANGE LEFT
FOR VALUES (N'BigScreen', N'Computer', N'PoolTable')
The range in which a function is defined determines which boundary the partition values will belong.RANGE LEFT: Specifies that partition values will be less than or equal to the values defined in the Partition Function. In the Partition Function used above, four partitions are created. All products with names that are less than or equal to 'BigScreen' will be mapped to the first partition. Any products with names that are greater than 'BigScreen' and less than or equal to 'Computer' will be mapped to the second partition, and so on. RANGE RIGHT: Specifies that partition values will be less than the values defined in the Partition Function. In the Partition Function used above, four partitions are created. All products with names that are less than 'BigScreen' will be mapped to the first partition. Any products with names that are greater than or equal to 'BigScreen' and less than or equal to 'Computer' will be mapped to the second partition, and so on.
The Partition Function is created, so now I need to create the Partition Scheme. For the example, I will map all partitions to the primary filegroup. If I wanted to map the partitions to different filegroups, I would include the filegroup name ordinally in the filegroup list. Here's the script for the partition scheme:
CREATE PARTITION SCHEME [ps_Product_Scheme] AS PARTITION [pf_Product_Partition]
ALL TO ([PRIMARY])Now that I have the tables set up for the partitions, I can create the SalesHistoryArchive table and load data into it. At the end of the CREATE TABLE statement, the Partition Scheme uses a field name from the table to tell SQL Server how to map the data from the table to partitions. View the code sample in Listing A.
I'm going to make sure the partitions are working correctly by running some queries on the data. The following query returns all rows from the SalesHistoryArchive table and uses the $partition function to indicate which partition the returned row is a member of:
SELECT $partition.[pf_Product_Partition](Product), *
This query returns all partitions that are mapped to the SalesHistoryArchive table:
SELECT * From sys.partitions
WHERE OBJECT_NAME(object_id) = 'SalesHistoryArchive'
In a future article, I will explain how you can modify these partitions on a table so that new data can be added, and old data can be deleted very efficiently.
Tim Chapman is a SQL Server database administrator who works for a bank in Louisville, KY, and has more than 7 years of IT experience. He is also Microsoft certified in SQL Server 2000 and SQL Server 2005. If you would like to contact Tim, please e-mail him at firstname.lastname@example.org.
-----------------------------------------------------------------------------------------Get SQL tips in your inbox
TechRepublic's free SQL Server newsletter, delivered each Tuesday, contains hands-on tips that will help you become more adept with this powerful relational database management system. Automatically subscribe today!
Tim Chapman is a SQL Server MVP, a database architect, and an administrator who works as an independent consultant in Raleigh, NC, and has more than nine years of IT experience.