Data Management

Build directory structures using SQL Server 2005

See how you can use the recursion and XML features in SQL Server 2005 to build a file location on the fly.

If you ever work with directory structures on the filing system, you know how challenging it can be to traverse through folders to find specific file(s). If you store this type of information structure in the database, you are even more aware of what it takes to retrieve the data. Writing queries to pull this information is sometimes difficult to achieve and inefficient. You can use the recursion and XML features in SQL Server 2005 to build a file location on the fly.

Example

This example searches for a document and builds the path to the document based upon a parent-folder to child-folder relationship in the database. One file will belong to one folder, which may be a child folder in a long lineage of parent folders. The ultimate goal is to provide a file to be searched for, and the process will build the location to the file.

I've seen file path locations stored several ways in the database, usually with the purpose of storing the location of a file to be pulled for a Web site. Most of the time, the full path to a file is stored in one database field, but I have also seen the location of a file "normalized" so that the past must be built when needed. My goal for this article is to solve the issue of building the path from the hierarchical structure.

The script below creates a Documents table and a Folders table. The Documents table stores the filenames and the folder that the document resides in. The Folders table stores the directory structure of one or more local or network drives. Most of the work in this example will involve traversing through this folder structure to build the path to the file.

IF OBJECT_ID('Documents','U') IS NOT NULL

DROP TABLE Documents

IF OBJECT_ID('Folders','U') IS NOT NULL

DROP TABLE Folders

IF OBJECT_ID('udf_BuildDocumentPath','FN') IS NOT NULL

DROP FUNCTION udf_BuildDocumentPath

CREATE TABLE Documents

(

        DocumentID SMALLINT,

        FolderID SMALLINT,

        DocumentName VARCHAR(255)

)

CREATE TABLE Folders

(

        FolderID SMALLINT,

        ParentFolderID SMALLINT,

        FolderName VARCHAR(255)

)

The code below adds data to our newly created tables. I am adding data for three documents, all of which are located in the same folder.

INSERT INTO Documents(DocumentID, FolderID, DocumentName)

VALUES(1,5,'SalesForecast2008.xls')

INSERT INTO Documents(DocumentID, FolderID, DocumentName)

VALUES(2,5,'SalesProjection.doc')

INSERT INTO Documents(DocumentID, FolderID, DocumentName)

VALUES(3,5,'SalesForecastPresentation.ppt')

INSERT INTO Folders(FolderID, ParentFolderID, FolderName)

VALUES(1,null, 'D:')

INSERT INTO Folders(FolderID, ParentFolderID, FolderName)

VALUES(2,1, 'Sales')

INSERT INTO Folders(FolderID, ParentFolderID, FolderName)

VALUES(3,2, 'Forecasts')

INSERT INTO Folders(FolderID, ParentFolderID, FolderName)

VALUES(4,3, 'Data')

INSERT INTO Folders(FolderID, ParentFolderID, FolderName)

VALUES(5,4, '2008')

GO

The script below creates the function that will build the full path to the file based upon the DocumentID in the Documents table. This function uses a recursive common table expression (CTE) to traverse through the directory structure, linking the child folder ID to the parent folder ID in the table. Once the set of records are found that comprise the full path to the document, the FOR XML PATH(‘') construct is used to "pivot" these values from values in different rows to values concatenated in the same row. From there, it is just a matter of returning the build path to the caller.

FOR XML Path() is one of my favorite features in SQL Server 2005 because it makes it so easy to take a list of column values from different rows and concatenate them together so that they are on the same row. It is an ideal tool for dynamically building SQL statements that require list of different values for use in an IN() statement.

CREATE FUNCTION udf_BuildDocumentPath

(

        @DocumentID SMALLINT

)

RETURNS VARCHAR(400)

AS

BEGIN

        DECLARE @ReturnPath VARCHAR(400)

;WITH DirectoryPathCTE(DocumentID, FolderID, ParentFolderID, DocumentName, FolderName, LevelNumber)

        AS

        (

        SELECT

               DocumentID, f.FolderID, ParentFolderID, DocumentName, f.FolderName, 0

        FROM

               Documents d

               INNER JOIN folders f on d.FolderID = f.FolderID

        WHERE

               DocumentID = @DocumentID

        UNION ALL

        SELECT

               DocumentID, f.FolderID, f.ParentFolderID, DocumentName, f.FolderName, p.LevelNumber + 1

        FROM

               Folders f

               INNER JOIN DirectoryPathCTE p on p.ParentFolderID = f.FolderID

        )

        SELECT @ReturnPath =

        (

               SELECT

                  FolderName + '' + CASE WHEN LevelNumber = 0 THEN DocumentName ELSE '' END

               FROM

                       DirectoryPathCTE p

               ORDER BY LevelNumber DESC

               FOR XML PATH('')

        )

        RETURN(@ReturnPath)

END

GO

Now that my function is built, I can call it for every document I have in my Documents table, and the path to the file will be built based upon the DocumentID in the Documents table.

SELECT dbo.udf_BuildDocumentPath(d.DocumentID)
FROM Documents d

Conclusion

Even if you never need to implement the example in this article, I hope you'll take away some ideas from the recursion and the FOR XML PATH clause to solve some tricky problems you may encounter in the future.

Tim Chapman a SQL Server database administrator and consultant who works for a bank in Louisville, KY. Tim has more than eight years of IT experience, and he is a Microsoft certified Database Developer and Administrator. If you would like to contact Tim, please e-mail him at chapman.tim@gmail.com.

-----------------------------------------------------------------------------------------

Get database tips in your inbox

TechRepublic's free Database Management newsletter, delivered each Tuesday, contains hands-on SQL Server and Oracle tips and resources. Automatically subscribe today!

About

Tim Chapman is a SQL Server MVP, a database architect, and an administrator who works as an independent consultant in Raleigh, NC, and has more than nine years of IT experience.

5 comments
Jakob.Flygare
Jakob.Flygare

Executing your batch outputs three times the same file path

ron
ron

If you need to do "insert into documents" for each file, then it will be a major task just to set it up for people who have tens, or hundreds of files in a typical user directory. There is little merit to implement your tool.

Krepenzis
Krepenzis

I wonder whether the given example can be used as is: tried to execute all the statements and when attempted to create the function the following error message occured: Msg 319, Level 15, State 1, Procedure udf_BuildDocumentPath, Line 9 Incorrect syntax near the keyword 'with'. If this statement is a common table expression or an xmlnamespaces clause, the previous statement must be terminated with a semicolon. Is anything I am doing wrong there? Thanks. Alexei

Guitarzan8
Guitarzan8

This article wasn't what I expected. I thought it would be about dynamically building a table of document names and locations based on an existing file structure, a feature that I currently need. I have an Intranet with a view images feature. The images to "catalog" are dumped in a folder by users. I have an access database that builds tables the Intranet uses, using VBA. I want to convert it SS2k5 for stability, conformation to standards and performance. Instead this article just retrieves records from tables where the names and locations already exist. There is no "building" occuring. Sorry, maybe I've been watching too many "American Idol" shows and I'm critical or my expectations were too high.

chapman.tim
chapman.tim

Please put a ; before WITH....I'll have the article corrected. Good catch! Tim

Editor's Picks