When redesigning a database, converting the data from the existing system is often promised but rarely done. It’s often listed as a project task, but when the time for rollout approaches and the pieces are pared, it’s usually put on the back burner. The excuse is “we can always do it afterwards”—probably during that mythical free time that exists in the future. There’s an easy, inexpensive solution, however.
Converting data is hard
If there were a simple one-to-one correspondence between the old system and the new, there would be no reason to build a new database. In any significant redesign, all the data will be mapped. This verifies that no data elements are lost and provides a framework for testing the new data models. However, after the new design is finished, the old data won’t fit without significant massaging. New keys between the new tables must be created. Usually data must be “invented” to fill in mandatory columns in the new schema. It’s tricky, time-consuming, and, with the requisite made-up data, often suspect.
Yet not converting data is costly
There was a reason to gather all that data in the first place. Throwing it away is akin to throwing away all the previous time and money spent on it. There is also a cost in morale. The people who see that old data get tossed may wonder why they are bothering to collect new data. After all, it will just be trashed in a few years. And developers will wonder if the new system they’re building is really worth the effort.
At one former job, we lost a contract with the mainframe hosting our database. We were given 30 day’s notice. We were in the middle of converting to Oracle on PCs, so the bosses decided to rush the release out before we lost the mainframe. There wasn’t even time to pull data off into other media. We ended up with a warehouse full of tapes that were completely unreadable without renting expensive mainframe time, so that data was useless. The project never recovered the stature it lost in users’ minds. There were good reasons for every step the managers took, but poor communication and a dismal result caused persistent funding problems for that project.
Saving unconverted data on old media is no answer
Saving your old data without converting it sounds like a good compromise, but it isn’t quite as easy as that. Both software and media age quickly. If you leave data on old media, such as 5 1/4 in. floppies or tapes that can’t be read on current tape drives, you really haven’t saved anything. CDs and even DVDs will probably end up as antiques sooner than we imagine. And even if your hardware can access the old files, will your current database software open them? Probably not. There’s cost to consider, too. Keeping dBase II files on Bernoulli drives is actually more expensive than converting them.
Saving your data without converting it is the way to go
One quick solution is to move the entire schema of the old database into the new one. Don’t convert any of the data. Don’t rename any of the tables or columns or anything. Don’t try to tie it to the new database with links to the new keys. Create a completely separate area for Old_Data and stuff everything in there. The beauty of this system is that, as your current data is upgraded, your ancient data will be too. When the entire database is shifted to new hardware, your old data is carried along with it. There’s no need to keep an old system working just to have access to old data.
None of the existing forms and reports will work
But you don’t need to add anything, so who needs forms? And if you document the old schema—the primary keys and foreign keys of each table—then users can actually go in and gather data using regular SQL queries. Saving the source code of old reports so someone can go in and look at the SQL is a good idea, but keep the source with the old data so it doesn’t get lost on some microtape in an unreadable format. To get better reports, you could investigate converting the existing reports. If it’s as easy as using the new software’s convert function, then do it. If it’s too complicated, you’re getting into time and money that could probably be better spent elsewhere.
A side schema for old data is a cheap solution
Old data that isn’t run through a costly conversion gets lost. Moving your existing data into a side schema of your new database means it won’t be converted, but it will be saved. It may never be used at all, but this kind of “conversion” is fairly cheap. And you’ve got the data right there where you need it when you finally find the time to convert this valuable resource. Of course, by then you’re probably getting ready for the next database redesign.