Data Management

Creating table partitions in SQL Server 2005

The typical usage scenario for a partitioned table is one that is very large and is experiencing degrading performance over time. Tim Chapman walks you through the steps of partitioning a large table in SQL Server 2005.

In the database world, partitioning is the horizontal splitting of rows in a table or index defined on a table in separate, smaller logical units. In fact, every table in SQL Server 2005 has at least one partition. You are essentially breaking one table or index up into smaller tables or indexes that act as one table or index. This can greatly increase performance because you are querying several smaller tables rather than one much larger one. The advantages increase when the server has multiple processors, which allow the partitions to be queried in parallel in the same query, and when the partitions are spread across multiple file groups.

Prior to SQL Server 2005, the process of partitioning tables was much more involved for DBAs; the partitioned tables were actually user tables in the database that were typically "unioned" together in a view to simulate one large table. From there, DBAs could define INSTEAD OF triggers on the view to manipulate the data in the underlying tables. In SQL Server 2005, administration is much easier because the database engine maintains the partitions in the background, allowing you to focus on other tasks.

It's not practical to partition every table in your database. The typical usage scenario for a partitioned table is one that is very large and is experiencing degrading performance over time. I'll walk you through the steps of partitioning a large table.

Setting up the partitioning example

My example demonstrates how to partition a table that stores archived data to increase query performance. Before diving into the code for the example, here's a brief overview of the moving parts that make up a partitioned table.

Partition Functions

When a table is partitioned, it is broken horizontally into smaller table partitions that are used together to assimilate the whole table. To do this, SQL Server has to know how to break the table into smaller parts. This is where a Partition Function comes into play. A Partition Function is the logic that maps the rows of a table or index into predefined partitions based on the values in a field of data. This will be the first partition object I create in the example.

Partition Schemes

You tell SQL Server how to horizontally split a table with a Partition Function. You also need to indicate how the partitions will be stored in the database. In SQL Server 2005, you map a table partition to a filegroup, which is basically a logic grouping that the database uses to store its data files and transaction log files. Every database has at least one filegroup named the Primary filegroup and additional filegroups for administrative and performance purposes. In a Partition Scheme, you can define that all partitions are mapped to the same filegroup, or you can use the scheme to split up the partitions across filegroups. The advantage is that, when the filegroups are split across individual disks, SQL Server will be better equipped to use resources when retrieving data. This advantage becomes even greater when you are running queries against the partitioned tables on a server with multiple processors.

Creating a partitioned table

Step one is to define the partition function that you will use to map out the partitions in the table. In the Partition Function defined below, I use three partitions, one for each product type in the SalesHistoryArchive table, which I will define later. These partitions will essentially break the SalesHistoryArchive archive table into three different tables that are automatically maintained by SQL Server.

CREATE PARTITION FUNCTION [pf_Product_Partition](VARCHAR(10)) AS RANGE LEFT

FOR VALUES (N'BigScreen', N'Computer', N'PoolTable')

The range in which a function is defined determines which boundary the partition values will belong.

RANGE LEFT: Specifies that partition values will be less than or equal to the values defined in the Partition Function. In the Partition Function used above, four partitions are created. All products with names that are less than or equal to 'BigScreen' will be mapped to the first partition. Any products with names that are greater than 'BigScreen' and less than or equal to 'Computer' will be mapped to the second partition, and so on. RANGE RIGHT: Specifies that partition values will be less than the values defined in the Partition Function. In the Partition Function used above, four partitions are created. All products with names that are less than 'BigScreen' will be mapped to the first partition. Any products with names that are greater than or equal to 'BigScreen' and less than or equal to 'Computer' will be mapped to the second partition, and so on.

The Partition Function is created, so now I need to create the Partition Scheme. For the example, I will map all partitions to the primary filegroup. If I wanted to map the partitions to different filegroups, I would include the filegroup name ordinally in the filegroup list. Here's the script for the partition scheme:

CREATE PARTITION SCHEME [ps_Product_Scheme] AS PARTITION [pf_Product_Partition]

ALL TO ([PRIMARY])

Now that I have the tables set up for the partitions, I can create the SalesHistoryArchive table and load data into it. At the end of the CREATE TABLE statement, the Partition Scheme uses a field name from the table to tell SQL Server how to map the data from the table to partitions. View the code sample in Listing A.

I'm going to make sure the partitions are working correctly by running some queries on the data. The following query returns all rows from the SalesHistoryArchive table and uses the $partition function to indicate which partition the returned row is a member of:

SELECT $partition.[pf_Product_Partition](Product), *

FROM SalesHistoryArchive

This query returns all partitions that are mapped to the SalesHistoryArchive table:

SELECT * From sys.partitions

WHERE OBJECT_NAME(object_id) = 'SalesHistoryArchive'

Next time

In a future article, I will explain how you can modify these partitions on a table so that new data can be added, and old data can be deleted very efficiently.

Tim Chapman is a SQL Server database administrator who works for a bank in Louisville, KY, and has more than 7 years of IT experience. He is also Microsoft certified in SQL Server 2000 and SQL Server 2005. If you would like to contact Tim, please e-mail him at chapman.tim@gmail.com.

-----------------------------------------------------------------------------------------

Get SQL tips in your inbox

TechRepublic's free SQL Server newsletter, delivered each Tuesday, contains hands-on tips that will help you become more adept with this powerful relational database management system. Automatically subscribe today!

About

Tim Chapman is a SQL Server MVP, a database architect, and an administrator who works as an independent consultant in Raleigh, NC, and has more than nine years of IT experience.

7 comments
philip.grogan
philip.grogan

Are Primary Keys not required when using partitioning? Marking the SaleID column as a Primary Key fails, (unless you add the partitioning column, which makes the Primary Key constraint inappropriate). I cannot find any examples on MSDN, or other SQL sites, that use primary keys or Unique constraints on the partitioned tables. I'm probably missing something very subtle...

gavinp
gavinp

this is the best read i ever had

MikeSQLDBA
MikeSQLDBA

You wrote: RANGE RIGHT: Specifies that partition values will be less than the values defined in the Partition Function. In the Partition Function used above, four partitions are created. All products with names that are less than or equal to ?BigScreen? will be mapped to the first partition. Any products with names that are greater than ?BigScreen? and less than or equal to ?Computer? will be mapped to the second partition, and so on. But this is word for word the same text as for RANGE LEFT. It should read: RANGE RIGHT: Specifies that partition values will be less than the values defined in the Partition Function. In the Partition Function used above, four partitions are created. All products with names that are less than ?BigScreen? will be mapped to the first partition. Any products with names that are greater or equal to ?BigScreen? and less than ?Computer? will be mapped to the second partition, and so on.

philip.grogan
philip.grogan

I think I've worked it out: You can have a Primary Key on partitioned tables, but you must also have a Clustered Index including the 'primary key' column and the 'partition' column. 1) Create clustered Index: CREATE CLUSTERED INDEX IDX_SalHistArch ON SalesHistoryArchive(SaleID ASC, Product) ON [ps_Product_Scheme](Product) 2) Create non-clustered Primary Key ALTER TABLE SalesHistoryArchive ADD CONSTRAINT PK_SalHistArch PRIMARY KEY NONCLUSTERED (SaleID) ON [PRIMARY] So now the clustered index goes across the filegroups with the partitioned data and the primary key constraint lives separately on the filegroup [PRIMARY]. That's what I wanted anyway...

chapman.tim
chapman.tim

Thanks for noticing. I seen it this morning when it was posted so I had it corrected. Thanks for catching it though. Tim

thisisfutile
thisisfutile

I noticed that both defs were the same too, and because I've not been exposed to table partitioning, I just assumed I was missing something. Thanks for the post Tim, I'm going to look into table partitioning for SQL 2000 because we've got software that is built around our SQL tables and they didn't incorporate any archiving features for documents. We've been using the system for 2 years and these tables are getting rather large and I'm starting to blame some of our look-up (search) problems on the table size. Again, thanks for the post, Tim (and for the correction, Mike)

chapman.tim
chapman.tim

It is doable in 2000, but requires a bit more admin overhead, because the tables must be maintainted seperately and unioned together with views. However, if you are seeing performance degredation it is definetly worth taking a look at it.