Data Management

Creating variables in SQL Server 2005 Integration Services

Creating and consuming variables in SQL Server 2005 Integration Services isn't always easy. But it's a powerful capability worth learning if you're planning to make the most of this powerful ETL tool.

SQL Server 2005 Integration Services (SSIS) is really well thought out in terms of how it supports user-created variables inside the packages you create. It is, however, not always that intuitive how to create, consume, and manipulate those variables. So I'm going to take my next couple of blog posts to document what I've learned.

Why do you need variables?

I've found variables useful in a number of places -- perhaps most useful is taking data created from one task and sharing it with one or more other tasks in the package. For example, there is a row-count task in SSIS that you can use to count how many records get read in from a data source such as a flat file. You might want to compare that count later on with how many records get submitted into the database. A variable can store that value from the row-count task so that you can retrieve it later on in the package.

Creating variables

To create variables, click the Variables tab. By default, it's in the group on the left-hand side of Visual Studio next to the Toolbox tab. Then click the Add Variable button, which is the first button on the left inside the Variables window. Type in a name and then select the Data Type (i.e., string or numeric). If you already know a value for it, you can enter that as well when creating the variable.

Creating variables

You'll notice there's one other field in the Variables window, but it's one you can't modify directly. The Scope field indicates where your variable is scoped, whether it's visible at the package level or only within a particular task. This is kind of tricky -- you have to make sure you have the right scope selected in the Design window before clicking the Add Variable button, which determines the scope for the newly added variable. If you want the variable to be at the package level (i.e., visible to everything in the package), you need to make sure you don't have any of the tasks within the package selected when you add the variable.

In my example, I create two variables: one at the package level and one that's only visible within the Data Flow Task that I've added in the package. (In my next post, I'll go into more detail about using variables scoped to individual tasks. It turns out there are some pretty nifty uses for that, especially if the task is running another SSIS package.) Now I want to talk about how to save a value into a package-level variable for later consumption.

For this example, I create a simple data flow task. It has a database connection to the AdventureWorksDW sample running on my local SQL Server instance. There's an OLE DB Source that reads one of the tables in the database (it doesn't matter which table, but for the purpose of this discussion, I picked the DimGeography table). Finally, there's a Row Count task. (If this were a real project, I would have renamed all of the tasks something more explanatory. I've left them with the default names for simplicity and clarity.)

Tasks

Now double-click on the Row Count task to bring up its editor. At the bottom of that first edit screen, there's a Custom Property called VariableName -- that's where you enter the variable you'd like to use as the storage place for this task's output. It won't let you enter a variable that doesn't already exist, and, yes, it is case-sensitive. I input the name of my variable, which I called RowCount.

Custom Property

To make sure everything's working, I run the package in debug mode (hit [F5] or go to Debug | Start Debugging). If everything is working, the task boxes should all turn green to indicate they ran successfully. You'll see that it counted 655 rows.

Row count

Next time, I'll go beyond the basics and dive into some ways of using these variables, including from script tasks and passing from one package to another.

Additional resources about SSIS on TechRepublic

3 comments
mberman753
mberman753

I have a question: My goal is to read the first row of a csv file, take the company field from that data, purge records from the database based on that value, and then continue to the next steps of the data flow. In SQL 2000, on a Data Driven Query Task, on the Options tab, you could set First Row and Last Row. By setting both of those to '1', only the first row was read from the incoming csv file. I am migrating to SSIS, but i don't see the 'Last Row' feature. I found the 'First Row', in the Connection Manager for the csv file, under Properties, "Header Rows to Skip". I'd appreciate any input..

tish.whalen
tish.whalen

I have used this in the past. However, I find myself needing to output the variable rowcount to a flat file for audit purposes. Any suggestions?

Justin James
Justin James

Do you use SSIS, or are looking to use it? Or do you prefer to completely "roll your own" integrations with completely external code? J.Ja

Editor's Picks