I’ve been involved in several conversion projects, in which mission-critical data needed to be moved from one system to another. While it’s difficult enough having to match data formats in two software packages, the difficulty increases when the two applications are hosted on different databases (i.e., SQL Server and Oracle) or on different hardware platforms (Windows 2000 and UNIX).
Such projects require that consultants be familiar enough with both environments to write, test, and debug conversion programs to unload data from one system and load it into the other.
The rise of data warehousing has spurred a solution to this problem: Extract, Transform, and Load (ETL) tools. Microsoft’s support for data warehousing in SQL Server (version 7.0 and later) includes a set of ETL tools called Data Transformation Services (DTS), which allows you to copy data between data sources without having to know how to use either system’s tools.
For consultants, this makes moving data much easier: The coding is already done; all you need to do is specify what data to move and what transformations to apply to it. In this article, I’ll show you how the DTS tools work and how you can use them to make data migration projects easier.
A set of DTS specifications is called a package. It includes Connections, Tasks, and Workflow.
- Connections are the sources and destinations of the data being moved by the package. They identify the databases to be accessed and which login credentials to use for each.
Connections read from and write to multifile databases, such as SQL Server and Oracle; single-file databases, such as Access, FoxPro, dBASE, and Paradox; and even individual data files, such as Excel spreadsheets and text files.
Any database with an Object Linking and Embedding Database (OLE DB) provider or open database connectivity (ODBC) driver can be accessed via DTS connections.
- Tasks are the actions that the package performs. Some tasks require both source and destination connections; others require only a destination connection; and some operate without any connections at all (or define their own connections).
Among the built-in tasks are the Transform Data Task for copying data; the Execute SQL Task to select, update, or delete data or to create or drop tables; the Execute Process Task to execute an operating system command or batch file; and the Send Mail Task to send notifications. Additional custom tasks can be added.
- Workflow indicates the order in which tasks are to be performed. Tasks in a DTS package can execute sequentially or in parallel and can be conditioned on whether a previous task completed successfully. One package can even call another as part of its workflow, leading to modular design. The three workflow options are On Success, On Failure, and On Completion (both Success and Failure).
DTS Import/Export wizard
To perform a simple copy from one source to one destination with minimal conversion, use the DTS Import/Export wizard. The wizard creates a package containing a single Transform Data Task and its two connections.
The easiest way to start the wizard is from Enterprise Manager (see Figure A below). Right-click on a database name in the Navigator pane, select All Tasks, and select either Import Data or Export Data. This tells the wizard that the database you clicked is the source (for exports) or the destination (for imports).
|SQL Server Enterprise Manager|
The wizard will then proceed through the following screens:
- Choose A Data Source: Here, you select the OLE DB or other connection from which DTS will copy data.
- Choose A Destination: Here, you select where the data will be copied to.
- Specify Table Copy Or Query: You can copy one or more tables by selecting them from a list or by writing a SQL query to select the data to be copied. If both connections are SQL Servers, a third option appears for copying both objects and data.
- Select Source Tables And Views: Here, you match a destination table for each source table or query you specified. Click on the Transform button to proceed to the next screen.
- Column Mappings And Transformations: You can change the names of the output columns, redistribute the data, and even code VBScript or JScript programs to manipulate the data.
- Save, Schedule And Replicate Package: The final screen lets you save the package and schedule it for later execution.
The DTS Import/Export wizard can also be run from the DOS command line as Dtswiz.exe. If you supply some of the answers via command-line parameters, the wizard will skip over those screens, shortening the dialog.
(Tip: Save time building complex packages by using the wizard to create the first two connections and a single Transform Data Task. Then, edit the resulting package in DTS Designer to add the remaining tasks and connections.)
For complex jobs, use DTS designer
DTS Designer is a graphical editor that can be used to create more elaborate conversion jobs. The interface consists of a toolbox and work area. You can drag items from the toolbox onto the work area, define their properties, and then connect them to create a workflow.
The Connection toolbox displays 11 icons that correspond to the data sources supported by DTS. Drag an icon onto the screen and DTS will prompt you for connection information, such as username and password, or file location in the case of Access and Excel. Give the connection a name, which labels the icon on the screen.
The Task toolbox has 17 icons that represent the types of actions you can program in DTS. Drag a task icon into the working area to set its properties. For the Send Mail task, for example, you enter the recipient of the e-mail, subject line, and message text.
Finally, connect the tasks together to create a workflow. Click on two task icons, in the order in which they should be executed. From the Workflow menu, select one of the following conditions:
- On Success executes the second task only if the first one finishes normally.
- On Failure executes the second task only if the first one returns an error.
- On Completion executes the second task after the first, regardless of outcome.
For example, you could create an Execute SQL Task to create a new table in the destination database. If it succeeds, the package would execute a Transform Data Task to load the new table. If the Execute SQL Task fails, the package would send an e-mail instead to notify you of the problem.
(Tip: DTS also includes tasks designed specifically for copying database objects between SQL Server databases. These tasks copy not only the data but the metadata (descriptions of the objects and their properties) as well. Logons, stored procedures, tables, and even entire databases can be copied via DTS using these tasks in a package.)
Store DTS packages for on-demand or scheduled execution
Whether you use the Import/Export wizard or DTS Designer to create a package, you can run it immediately or schedule it for later execution. Scheduling options include:
- Daily, or every “n” days. You can also schedule daily jobs to run several times per day.
- Weekly, or every “n” weeks, on any combination of days (check boxes for the days the package is to run).
- Monthly, or every “n” months, on a particular day, week, or date.
DTS packages can be stored in the local SQL Server’s msdb database for later editing. They can also be stored in the type of Structured Storage files used by Microsoft’s Component Object Model (COM) or as Visual Basic projects. Once stored, they can be edited using DTS Designer.
When the big moment comes to cut over to the new system, you can simply execute each tested and saved DTS package in turn. For situations when you’ll be running two systems in parallel for a period of time, you can schedule the DTS package to run on a regular basis to keep the new system updated from the old.
SQL Server DTS is a powerful and flexible set of tools for moving data between systems. The real surprise is that both the source and destination can be systems other than SQL Server. Now conversions between otherwise incompatible, flat file systems can be tackled easily, as can extracting data from systems with no export tool of their own.
The odds are good that your client is already licensed for one or more copies of SQL Server, either by itself or as part of Microsoft BackOffice. On larger projects, including a license for SQL Server in the project budget may very well pay for itself in the time saved writing, testing, and debugging separate conversion programs.
Dealing with data migration
If you’re a consultant, what types of data migration projects have you done, what problems did you have, and how did you solve them? Tell us about your experiences. Post your comments in the discussion or send us an e-mail.