Data Management

Convert a string array to tabular data in SQL Server

Many DBAs using SQL Server 2000 or SQL Server 2005 are still not very comfortable parsing XML data to get an array of values. Check out this approach for converting arrays of string values to a dataset in SQL Server 2000 and later.

If you use SQL Server for any considerable amount of time, you'll likely encounter a situation in which it would be very useful to pass an array of values into a function or procedure and handle data based on that array of values.

In SQL Server 2008, you can pass tables as parameters. In SQL Server 2005, you can pass around values in such a way using the native XML data type. You can even accomplish it in SQL Server 2000 by using OPENXML and using XML data similar to the way you'd use it in SQL Server 2005.

Many DBAs using SQL Server 2000 or SQL Server 2005 are still not very comfortable parsing XML data to get an array of values (and no one is using SQL Server 2008 yet). I'll show you an approach I use for converting arrays of string values to a dataset in SQL Server 2000 and later.

When to use this approach

I'm not a huge fan of this approach of passing parameters. I feel that SQL Server has more than adequate native constructs for handling such situations. However, I have fielded literally hundreds of questions regarding this scenario in various SQL Server forums, so it is a generically usable approach to passing an array of data.

An Array List

For my example, the Array List is a string of values separated by a common delimiter. This a valid array list for the example:

DECLARE @ArrayList VARCHAR(MAX)
SET @ArrayList = 'tim,zach,chris,wendi,brad'

The goal is to take that string of values and pivot them so that all of the values occur in the same row of data; then you can join the data to another table, etc. To do this, I will pass this list to a table-valued function, and the function is going to take care of the pivoting work for me. Below is the code to my pivot function:

CREATE FUNCTION [dbo].[udf_PivotParameters]

    (

      @ParamaterList VARCHAR(MAX),

      @Delimiter CHAR(1)

    )

RETURNS @ReturnList TABLE

    (

      FieldValue VARCHAR(MAX)

    )

AS BEGIN

    DECLARE @ArrayList TABLE

        (

          FieldValue VARCHAR(MAX)

        )

    DECLARE @Value VARCHAR(MAX)

    DECLARE @CurrentPosition INT

 

    SET @ParamaterList = LTRIM(RTRIM(@ParamaterList))

        + CASE WHEN RIGHT(@ParamaterList, 1) = @Delimiter THEN ''

               ELSE @Delimiter

          END

    SET @CurrentPosition = ISNULL(CHARINDEX(@Delimiter, @ParamaterList, 1), 0)

    IF @CurrentPosition = 0

        INSERT  INTO @ArrayList ( FieldValue )

                SELECT  @ParamaterList

    ELSE

        BEGIN

            WHILE @CurrentPosition > 0

                BEGIN

                    SET @Value = LTRIM(RTRIM(LEFT(@ParamaterList,

                                                  @CurrentPosition - 1))) --make sure a value exists between the delimiters

                    IF LEN(@ParamaterList) > 0

                        AND @CurrentPosition <= LEN(@ParamaterList)

                        BEGIN

                            INSERT  INTO @ArrayList ( FieldValue )

                                    SELECT  @Value

                        END

                    SET @ParamaterList = SUBSTRING(@ParamaterList,

                                                   @CurrentPosition

                                                   + LEN(@Delimiter),

                                                   LEN(@ParamaterList))

                    SET @CurrentPosition = CHARINDEX(@Delimiter,

                                                     @ParamaterList, 1)

                END

        END

    INSERT  @ReturnList ( FieldValue )

            SELECT  FieldValue

            FROM    @ArrayList

    RETURN

   END

The function accepts the list of values, along with the delimiting character to separate the list. Then the function loops through the array list, inserting everything between the delimiting characters into a table variable, which is eventually returned as the output from the function. Because this is a table-valued function, you need to return the values from a SELECT statement. This code snippet shows you how:

DECLARE @ArrayList VARCHAR(MAX)
SET @ArrayList = 'tim,zach,chris,wendi,brad'
SELECT * FROM dbo.udf_PivotParameters(@ArrayList, ',')

Performance

I suggest that you use this approach sparingly if possible because the performance is going to drastically degrade as the values in the list increases.

Tim Chapman a SQL Server database administrator and consultant who works for a bank in Louisville, KY. Tim has more than eight years of IT experience, and he is a Microsoft certified Database Developer and Administrator. If you would like to contact Tim, please e-mail him at chapman.tim@gmail.com.

About

Tim Chapman is a SQL Server MVP, a database architect, and an administrator who works as an independent consultant in Raleigh, NC, and has more than nine years of IT experience.

9 comments
rhshani
rhshani

Daer Sir Hope u r fine, There is a problem that not solved by me, pls help........ I hav a database in MSSQL server 2008, I hav export into MSSQL serbver 2000, but it removes all constraints from the tables.... pls tell the way how i convert all database from sql 2008 to sql server 2000 without removing constraints and all views and procedures also convert pls sove it i shall be thanks full to u............... . thanks regards Zeshan Habib rhshani@yahoo.com

anrpgpgmr
anrpgpgmr

Very nice. Thanks. It does work VRC. Perhaps just passing it a string of strings rather than constructing such a complex model. Such as: DECLARE @ArrayList VARCHAR(8000) SET @ArrayList = 'tim~~~1~~~first,zach~~~2~~~second,chris~~~3~~~third,jerry~~~4~~~fourth,wendi~~~5~~~fifth,brad~~~6~~~sixth' SELECT * FROM dbo.udf_PivotParameters(@ArrayList, ',') Then in your procedure or code, iterate the recordset and perform this function again to glean the individual field data. So iteration one is for row data: SELECT * FROM dbo.udf_PivotParameters(@ArrayList, ',') and second iteration is for field data: SELECT * FROM dbo.udf_PivotParameters(@ArrayList, '~~~').

VRC
VRC

It Does not work: Below are the bugs after fixing some:--- sp_GetDirList, receives a list of Courses from either MH-60R or MH-60S --IF EXISTS(SELECT * FROM dbo.sysobjects WHERE name = N'sp_GetDirList' AND Type = 'P') --DROP Procedure sp_GetDirList --GO --CREATE PROC dbo.sp_GetDirList --@DirList varchar(700) --= 'ABC'',''DEF'',''GHI''' --comma delimited list of codes or data ie: '''ABC'', ''DEF'', ''GHI'' ---SET @ArrayList = 'tim,zach,chris,wendi,brad'; -- the goal is to tke a string of values and -- pivot them so that all the values occur in the same row of data -- to this the list must be pass to a table-valued function below: CREATE FUNCTION [dbo].[udf_PivotParam] ( @ParamList varchar(700), @Delimeter char(1) ) RETURNS @ReturnList TABLE ( FieldValue varchar(700) ) AS BEGIN DECLARE @ArrayList varchar(600) ---,@ParamList varchar(600) SET @ArrayList = 'tim,zach,chris,wendi,brad' DECLARE @ArrayTable TABLE ( FieldValue varchar(600) ) Declare @Value varchar (600) Declare @currentPosition int SET @ParamList = LTRIM(RTRIM(@ParamList)) + CASE WHEN RIGHT(@ParamList, 1) = @Delimeter THEN '' ELSE @Delimeter END SET @CurrentPosition = ISNULL(CHARINDEX(@Delimeter, @ParamList, 1),0) IF @CurrentPosition = 0 INSERT INTO @ArrayList ( FieldValue) Select @ParamList ELSE BEGIN WHILE @CurrentPosition > 0 BEGIN SET @Value = LTRIM(RTRIM(LEFT(@ParamList, @CurrentPosition - 1))) --make sure a value exists between the delimeters IF LEN(@ParamList ) > 0 AND @CurrentPosition

aikimark
aikimark

Tim, Nice example. I think your performance restriction is due to the iterative recreation of the @ParamaterList parameter/variable. Like concatenation, it involves memory allocation, deallocation and copy operations. I'm not sure of the performance effect of doing these operation on a parameter versus a local variable, but you could easily run a comparative performance test. I think you could easily tweak this code to be a very flat performer...eliminate the removal of the items from @ParamaterList within the loop. Instead of always looking for the delimiter in the first position, you would move BOTH the starting position and the delimiter positions. If you want to simplify this logic, you might concatenate a trailing delimiter (if it doesn't have one) before you begin looping.

AryMir
AryMir

One need to add the following If Len(@ParamaterList)> 0 and CurrentPosition=0 Begin INSERT INTO @ArrayList ( FieldValue ) SELECT @ParamaterList End just before INSERT @ReturnList(FieldValue) statement or the last array item will not be retured or if the array has only one element nothing is returned. Ary

chapman.tim
chapman.tim

You're a C++ programmer, aren't you? ;) It probably could be tweaked a bit, but in the end, you still have to loop through the parameter list that is passed in and put it into a temp table. So, it will grow increasingly slower when the number of items go up. If you don't mind, drop me a line at chapman.tim@gmail.com...I have a couple of questions for you.

rporrata
rporrata

Although overkill for this senario a C# CLR proc can do the work as well and return the result set as a pipe stream. This can take card of the larger data sets if they are too painful and/or involve a bit more processing. They work very well in my experience and use them in reports as well analytics. They may not be a easy to promote as T-SQL procedure but something we should consider as DBAs/SQL Programmers to help us with our day-to-day duties.

aikimark
aikimark

@Tim No. I try to avoid C++ whenever possible. While it's true that both require iteration, longer strings add significant (noticable) overhead caused by memory and string reconstruction operations with the original algorithm. This overhead decreases with the length of the string.

rporrata
rporrata

Tim thanks for your blog. It great to hear what other are doing out there.