Forrester reports that up to 73% of big data goes unused in organizations, yet very little of it is discarded. The main reasons for retaining old data that is seldom or never accessed are:

  • Concerns that new types of analytics, such as long-term trending, might make this all-but-forgotten data necessary.
  • The possibility of litigation, which requires the ability to retrieve old documents and emails that might go back years.

The challenge lies in managing the storage for this data, which tends to be out of mind once it is out of sight.

One issue is that IT treats storage as a commodity. Storage is considered cheap, so no one thinks much about ordering more disk or even tape when it is needed.

But is storage really cheap?

Craig Hollins, business manager at Australian managed service provider PPS, spoke with Mitch Tulloch for TechGenix about the increasing costs of the bandwidth and infrastructure that are often required to support more storage (even cheap storage) and the substantially larger files that contain big data. More storage requires more money for supporting resources like processing, networks, and personnel. Storage-related costs also rise when multiple versions of large files are kept, because these files complicate disaster recovery and backup procedures.

The takeaway is that extra storage set aside for data that is seldom used or not used at all, especially if it is big data, demands resources above and beyond storage. This adds to the overall IT expenses.

Here are four steps companies can take to effectively manage their cold storage big data.

1. Use inexpensive but dependable cold storage

For big data that is seldom used or archived, slow hard drives and tapes are the most commonly used storage media. The key is to test your disks and tapes periodically to ensure they are in good working order. Also, avoid the temptation to simply relegate your older drives and tapes to archiving and data backup functions. These resources still have finite lifespans, and older assets are more likely to fail.
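
One way to make the periodic test concrete is to record a checksum for every archived file and re-check those checksums on a schedule. The following is a minimal sketch, assuming the archive is mounted as an ordinary directory; the function names (`sha256_of`, `build_manifest`, `verify_manifest`) are hypothetical helpers written for this illustration, not part of any storage product.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archive files are never loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify_manifest(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or no longer match."""
    problems = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(rel_path)
    return problems
```

In this sketch, the manifest would be built once when data is written to cold storage and kept somewhere other than the archive itself. A scheduled `verify_manifest` run that returns a non-empty list then signals that a drive or tape needs attention.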

2. Consider cloud-based cold storage

If you don’t want to store your big data onsite or in a physical off-premises facility, going to the cloud is an option. There are a number of cloud-based cold storage choices, and one of them might prove to be the most economical way to store all of your cold data.

3. Perform annual evaluations of cold storage data

Just because you have a means of storing unused data doesn’t mean that you should routinely store all of it. If you haven’t already, now is the time to sit down with end-user management and your legal department to determine which data you should keep and which you can discard. You should evaluate your cold storage data every year.
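
A short script can give that annual review a starting point by listing the files that have not changed in a long time. This is a rough sketch, assuming the data is reachable as a directory tree and that modification time is an acceptable proxy for "unused" (access times are often disabled on storage mounts). The function name `review_candidates` and the age threshold are hypothetical, and the output is a list for people to review, not a deletion list.

```python
import time
from pathlib import Path
from typing import Optional

SECONDS_PER_DAY = 86_400


def review_candidates(
    root: Path, max_age_days: int, now: Optional[float] = None
) -> list[tuple[str, int]]:
    """List (relative path, size in bytes) for files older than max_age_days.

    Results are sorted largest first, since large stale files cost the most.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    candidates = []
    for path in root.rglob("*"):
        if path.is_file():
            info = path.stat()
            # Assumption: last-modified time stands in for "last used".
            if info.st_mtime < cutoff:
                candidates.append(
                    (path.relative_to(root).as_posix(), info.st_size)
                )
    return sorted(candidates, key=lambda item: item[1], reverse=True)
```

The resulting list is what you would bring to end-user management and the legal department, who decide what must be retained regardless of age.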

4. Use data/storage automation

Most storage providers offer tiered data storage that is facilitated by artificial intelligence (AI). The AI takes the data storage rules that you define and automatically applies them to determine where each piece of data is stored.

The primary tier of data storage is in-memory storage or solid-state drives, where your frequently accessed data is stored. Data that is used intermittently can be stored on a secondary tier that uses less expensive hard drive storage.

Data that is seldom used, your cold storage data, is assigned to very slow disk drives or tapes, which are your least expensive storage media. By taking advantage of this automation, you can be sure your heavily used data is always readily available to users while your seldom-used data is stored at the lowest cost.
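
The rule-driven part of this tiering can be illustrated with a small sketch. It is not any vendor's implementation and involves no AI; it only shows how user-defined rules might map data to the three tiers described above. The `TierRule` class, the `assign_tier` function, and the 30-day and 365-day thresholds are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierRule:
    tier: str           # name of the storage tier
    max_idle_days: int  # data idle for at most this many days qualifies


# Rules are ordered from hottest to coldest tier.
# Assumption: the thresholds below are illustrative, not recommendations.
RULES = [
    TierRule("primary", 30),     # in-memory storage or solid-state drives
    TierRule("secondary", 365),  # less expensive hard drive storage
]
COLD_TIER = "cold"               # very slow disk drives or tapes


def assign_tier(idle_days: int, rules=RULES, default: str = COLD_TIER) -> str:
    """Return the first tier whose idle-time limit the data still meets."""
    for rule in rules:
        if idle_days <= rule.max_idle_days:
            return rule.tier
    return default


print(assign_tier(3))    # primary
print(assign_tier(120))  # secondary
print(assign_tier(900))  # cold
```

Keeping the rules in a plain ordered list means the policy can be changed by editing data rather than code, which is the same idea as defining rules once and letting the storage system apply them.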

Final remarks

Most big data storage management strategies focus on making data readily available to users in real time, but this also increases spending on storage and processing. Companies can help offset these larger expenditures by reviewing the seldom-used big data they have under management and ensuring that it is stored at the lowest cost. For this data, cold storage media are a secure, reliable, and affordable solution.