Let me tell you a story that illustrates the truth in the maxim, “Garbage in, garbage out.” I received a call from a consulting client who wanted me to stop by and see if I could help “clean up” a database.

Someone in the client’s organization decided to build a database. She designed a data entry form, photocopied it, and spent several weeks doing research and filling in the blanks on the paper forms. Then she hired a bunch of temporary employees to enter the data into a spreadsheet.

Unfortunately, the project manager violated the most basic, commonsense rules of managing data entry, and she wound up with a mess. If you find yourself in charge of building a database, follow these simple guidelines to make sure you don’t have to hire someone like me to come in and clean it up for you.
Subscribe to Jeff Davis’ Help Desk TechMail now, and you’ll get a bonus of Jeff’s picks for the best Web sites for IT support professionals—exclusively for TechMail subscribers.
Of batches and control sheets
I cut my database teeth working for a law firm, creating and maintaining hundreds of large-scale databases for various cases. In that setting, I enjoyed the luxury of being able to employ second-shift word processing operators to key the raw data.

I learned early on that the best way to enter data is in batches. If you have a stack of data entry forms, you create a control sheet that lists the following information:

  • Batch number
  • Date batch entered into system
  • Initials of person entering batch into the system
  • Number of records in the batch
  • The first and last record numbers (from the database) for this batch

Many developers go so far as to set up a field for “batch number” in the database itself. However, whether or not you store the batch number electronically, tracking the “batches” of records entered is the best and only way to audit the data or troubleshoot problems.

After the records are entered, you fill out the control sheet, attach it to the top of the stack of data entry sheets, and store those paper forms in a safe place. That way, if your database is accidentally damaged (and not backed up), you always have the option of rekeying your records.

Training and proofreading
Of course, using a control sheet does no good whatsoever if you don’t properly train your people. Take the time to “walk” a data entry clerk through a dozen or so records. Make sure the clerk understands how the software works, including:

  • How to open the database
  • How to enter records correctly (including details such as how to enter dates, names, abbreviations, and so on)
  • How to save the records and close the database
  • How to print the records for proofreading, if applicable
  • What to do with the control sheet and the “batch” of forms

The alternative: A big, fat mess
My consulting client did so many things wrong, it makes me want to cry. The person who (mis)managed this database project:

  • Had never heard of batches. She just kept two stacks, “done” and “need to be done.”
  • Didn’t train the data entry clerks. Some of the clerks were throwing away the forms after they entered the records! Others didn’t know how to save their work, and so hundreds of records had to be entered twice.
  • Didn’t enforce data validation rules. Even if you don’t have the option of customizing your software, you must communicate with the people entering your data. Some of the clerks entered all of the text in uppercase letters. Some of the clerks used abbreviations, while others didn’t. Many fields were left blank that should have been “required” entries.
  • Didn’t design the database properly in the first place. I kid you not, folks, this lady set up four “address” fields, and she had the clerks enter address one, address two, city, state, and ZIP code all in the same field!

I am astonished at how na�ve this client’s project manager was about the rudiments of building a database. Maybe I shouldn’t complain. I am, after all, charging the client a pretty penny to come in behind this bozo, clean up the database (in part by parsing the address records into separate fields), and create some reporting tools.

That said, I hate that so much of this client’s resources were wasted because the project manager didn’t have a clue about building a database. Don’t let it happen in your shop.
Have you had to clean up any database messes lately? Please post a note below or follow this link to write to Jeff.