Data spring cleaning tips for SMBs

Andy Moon shares his seven-step process for getting rid of unnecessary files in bulk without the danger of losing any data. Tell us about your data spring cleaning best practices.

The Oracle Application Users Group released the results of a study last week that revealed 87% of the respondents blame data growth for their performance issues. As I opined recently, I think it could be very good for IT if users culled data that they don't need in order to reduce stress on storage and backup infrastructures. A recent PC World article suggests that users who don't do such cleaning regularly may be costing their companies a lot of money.

On the infrastructure side of things, there are a lot of technologies aimed at trying to help organizations spend less on storage. Deduplication technologies remove redundant information on storage devices so that things like operating system files or presentations that exist on many users' home drives will only take up space once. Multi-tiered storage allows the most critical data to be stored on high-speed, expensive hardware while less crucial or less frequently used data resides on slower, cheaper hardware.
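To make the deduplication idea concrete, here is a minimal Python sketch that finds files with identical contents by comparing SHA-256 digests. It is only an illustration at the file level; commercial deduplication products often work on blocks inside the storage system instead, and the function names `file_digest` and `find_duplicates` are my own, not part of any product.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Group files under root by content; return only groups of 2 or more."""
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            by_digest[file_digest(path)].append(path)
    # Files that share a digest have the same contents and could be stored once.
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Each group returned by `find_duplicates` lists paths whose contents match, which is the redundancy a deduplicating system would store only once.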

Unfortunately, the IT-centered solutions leave us with the same problem: Data growth is explosive and nearly unchecked in all industries. Granted, there are many good reasons to keep a lot of this data, including regulatory requirements, files that need to be quickly accessible, and files that are accessed frequently. However, there are many files, particularly on users' desktop PCs, that are simply irrelevant, old garbage that should be treated as such. In the business world, most of us have too much work to do to spend the time it takes to clean up our data.

To help, I am posting my strategy for getting rid of unnecessary files in bulk without the danger of losing any data. This is what I teach my users to do, and we have been able to put off adding server space simply by reducing the volume of files on our servers.

  1. Get a CD or DVD burner for your PC.
  2. Burn all of your data files to a CD or DVD using the burner you acquired.
  3. Go through the media you just burned to make sure the burn was successful.
  4. Delete everything you just burned from your PC.
  5. Browse through the media you just burned and copy back to your PC only the files you absolutely know you will need in the next day or two.
  6. Keep the media in your drive for a few weeks and, when you need a file that is on the media, copy it back to your PC.
  7. Label the disc with the date (I do this yearly, so I label mine "Clean - 2009").
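Step 3 is the one that protects you from data loss, so it is worth automating if you are comfortable with a script. The following is a rough Python sketch, not part of my actual process, that compares every file in a source folder against the copy on the disc, byte for byte. The function name `unverified_files` and both folder arguments are hypothetical; point them at your own data folder and the mounted disc.

```python
import filecmp
import os


def unverified_files(source_root, backup_root):
    """Return relative paths of source files that are missing from the
    backup or whose contents differ from the backup copy."""
    problems = []
    for dirpath, _dirnames, filenames in os.walk(source_root):
        for name in filenames:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, source_root)
            dst = os.path.join(backup_root, rel)
            # shallow=False forces a byte-by-byte comparison instead of
            # trusting file size and modification time alone.
            if not os.path.isfile(dst) or not filecmp.cmp(src, dst, shallow=False):
                problems.append(rel)
    return sorted(problems)
```

An empty list means every file was found on the disc with matching contents. Only then would I move on to step 4 and delete the originals.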

For Outlook, the process is a little different. I keep a year of email in a PST on my desktop. Every year, I take the previous year's PST, archive it to CD or DVD, and close the PST in Outlook. So, at the end of 2009, I archived all of my 2009 email, created a new PST for 2010, burned my 2008 email, and then deleted the 2008 PST.

Using this strategy, I still have access to my old email and files if I need them, but they aren't taking up space on my hard drive, a network drive, or any backup medium. As a result, my inbox is a little over 13 MB, last year's PST is just under 500 MB, and My Documents is under 50 MB. I am also secure in the knowledge that if someone shows up needing an email or file from two years ago (when I started my current job), I have access to it.

What data spring cleaning best practices do you recommend? Do you think expecting users to clean up their own data is realistic? Share your tips and your thoughts in the discussion.
