Storage is an often-overlooked area, but with the increase in big data, it's worth paying attention to.
Of all of the IT disciplines, storage has been the most neglected and unappreciated. Managing storage is often consigned to a junior member of the operations team. Compared to areas such as applications and networking, there are few industry certifications available for storage professionals. Storage professionals are often overlooked for promotions--and in many cases, there is no credible career ladder for a storage professional.
Storage is also neglected in IT resource management. Sites frequently underutilize storage, deliberately striping storage drives so they contain no more than 20% of the data that they are capable of holding. In other cases, storage is simply neglected. No one bothers to check where unused storage could be made available. Instead, they just purchase more.
The lesson is clear: if you want to get more value from the money you spend in your data center, you need to apply sound management practices to storage as well as to other resources.
Here are six steps that CIOs and data center managers can take to optimize big data storage.
1. Review your tiered storage strategy for big data
Most data center managers concede that they are carrying far more big data than they want to. Leading reasons are fear of discarding data that may be useful in the future and requirements for eDiscovery and retrievability of data and documents. However, none of this precludes storing data in ways that optimize both processing and storage.
Data that is seldom used, or that may never be used but might be needed for legal purposes, can be stored on cold storage: tape or very slow (and cheap) disks, either in your data center or in the cloud. Data that must be available for rapid, daily access can be stored on very fast, but also very expensive, solid state drives. Between these two extremes is data that is accessed only occasionally, which can reside on mid-range disk drives.
By identifying which data should go where--and then putting it there--you will drive your storage costs down.
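As a rough illustration of the tiering idea, the sketch below assigns a dataset to a hot, warm, or cold tier based on how long ago it was last accessed. The 7-day and 90-day thresholds and the function name `assign_tier` are assumptions made for this example, not figures from any vendor or standard; a real policy would also weigh legal holds and retrieval-time requirements.

```python
from datetime import date

# Hypothetical thresholds (days since last access); tune them to your own access patterns.
HOT_MAX_DAYS = 7
WARM_MAX_DAYS = 90


def assign_tier(last_accessed: date, today: date) -> str:
    """Return 'hot', 'warm', or 'cold' based on days since last access."""
    age_days = (today - last_accessed).days
    if age_days <= HOT_MAX_DAYS:
        return "hot"    # fast, expensive solid state drives
    if age_days <= WARM_MAX_DAYS:
        return "warm"   # mid-range disk drives
    return "cold"       # tape or slow, cheap disk


if __name__ == "__main__":
    today = date(2024, 6, 1)
    for last in (date(2024, 5, 30), date(2024, 4, 1), date(2022, 1, 1)):
        print(last, "->", assign_tier(last, today))
```

Last-access age is only one possible signal; access frequency or business criticality could be used in the same way.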
2. Evaluate the costs of cloud-based (versus on premises) data scalability
Current wisdom says it's better to scale data storage upward for unusual peak data times in the cloud, because you are only renting this storage. However, there can also be hidden cost triggers that kick in when you exceed your normal data storage allocations in the cloud.
You should periodically evaluate how much it is actually costing you to scale in the cloud--and whether this is really cheaper than scaling upward in your own data center.
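One way to make this comparison concrete is a simple cost model. The sketch below assumes a hypothetical cloud plan with a flat base fee that covers an included allocation, plus a per-terabyte overage charge once you exceed it, and compares that with a flat per-terabyte monthly cost for on-premises capacity. All prices and parameter names are invented for illustration; real cloud bills typically include other items, such as data egress and request charges.

```python
def cloud_monthly_cost(used_tb: float, included_tb: float,
                       base_fee: float, overage_per_tb: float) -> float:
    """Base fee covers `included_tb`; anything above it is billed per TB."""
    overage_tb = max(0.0, used_tb - included_tb)
    return base_fee + overage_tb * overage_per_tb


def on_prem_monthly_cost(capacity_tb: float, cost_per_tb_month: float) -> float:
    """Flat monthly cost of owning capacity (hardware amortization, power, staff)."""
    return capacity_tb * cost_per_tb_month


if __name__ == "__main__":
    # Hypothetical peak month: 120 TB used against a 100 TB cloud allocation.
    cloud = cloud_monthly_cost(120, 100, base_fee=2000.0, overage_per_tb=30.0)
    local = on_prem_monthly_cost(120, cost_per_tb_month=25.0)
    print(f"cloud: {cloud:.2f}  on premises: {local:.2f}")
```

Running the same calculation for normal and peak months shows whether the overage charges change which option is cheaper.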
3. Inventory your storage resources and assess them for use
There isn't a site I know of that doesn't discover an inactive disk drive somewhere in its data center, in a data center storage room, or at a field location. If you don't have an up-to-date IT asset management system that tracks all of your assets, obtain one and start using it now. Storage should be one of the first areas you check for resources that are underutilized or not used at all, so you can see where to put them to better use. If you find a storage resource that is obsolete, get rid of it.
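If your asset management system can export capacity and usage figures, a short script can flag candidates for review. The sketch below is a minimal example under that assumption: the `Volume` record, its field names, and the 20% threshold are all hypothetical and should be replaced with whatever your own inventory provides.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    name: str
    capacity_tb: float
    used_tb: float


def underutilized(volumes, threshold: float = 0.20):
    """Return names of volumes whose used fraction is below `threshold`.

    Volumes reporting zero capacity are skipped to avoid dividing by zero.
    """
    return [
        v.name
        for v in volumes
        if v.capacity_tb > 0 and v.used_tb / v.capacity_tb < threshold
    ]


if __name__ == "__main__":
    inventory = [
        Volume("plant-a-array", 100.0, 10.0),
        Volume("hq-san", 100.0, 60.0),
        Volume("field-nas", 50.0, 0.0),
    ]
    print(underutilized(inventory))
```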
4. Evaluate your distributed data mart storage
This point goes hand-in-hand with the point above. Know where your distributed data marts (and storage) are, and how fully the storage is being utilized. If storage is being substantially underutilized, try to reallocate it to an area of greater need.
5. Enact edge storage policies and practices
What makes edge storage unique is that most of it is found in manufacturing facilities that use robots, artificial intelligence, machine learning, and automation. Edge storage enables you to temporarily store data at local plant collection points and then upload the data when more bandwidth is available--perhaps as a nightly batch process.
Edge storage management can be a concern because in many cases, local manufacturing engineers or plant managers without storage backgrounds are being asked to manage it. IT storage professionals need to be on top of these storage devices so they can monitor overall health and also determine which machine-generated data is stored and which (like the communications jitter between machines) is irrelevant to the business and should be discarded.
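The store-locally, upload-later pattern described above can be sketched in a few lines. In this hypothetical example, an `EdgeBuffer` keeps records at the plant, drops record kinds that have been judged irrelevant (such as jitter readings), and hands everything else to an upload function when a batch job runs. The class name, the `kind` field, and the `upload` callable are all assumptions for illustration; a real deployment would persist the buffer to disk and handle retries.

```python
class EdgeBuffer:
    """Holds machine-generated records locally until a batch upload runs."""

    def __init__(self, discard_kinds):
        self.discard_kinds = set(discard_kinds)
        self._records = []

    def collect(self, record: dict) -> bool:
        """Store the record unless its kind is on the discard list."""
        if record.get("kind") in self.discard_kinds:
            return False
        self._records.append(record)
        return True

    def flush(self, upload) -> int:
        """Send all buffered records in one batch; return how many were sent.

        The buffer is cleared only after `upload` returns, so an exception
        raised during upload leaves the records in place for the next attempt.
        """
        batch = list(self._records)
        if batch:
            upload(batch)
            self._records.clear()
        return len(batch)


if __name__ == "__main__":
    buf = EdgeBuffer(discard_kinds={"jitter"})
    buf.collect({"kind": "temperature", "value": 71.5})
    buf.collect({"kind": "jitter", "value": 0.003})
    sent = buf.flush(lambda batch: print("uploading", batch))
    print("records sent:", sent)
```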
6. Enact and practice data retention policies
The average company reviews data retention policies with users every few years, when these reviews should be occurring annually. Data retention and user access permissions should both be reviewed every year because these two aspects of storage are constantly changing. The exercise also forces users to decide which data they want and which they can live without.
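Both halves of this practice, the annual review and the retention period itself, lend themselves to simple date checks. The sketch below is one possible way to flag policies whose last review is more than a year old and data that has outlived its retention period. The function names and the 365-day default are assumptions for this example; actual retention periods are set by your legal and business requirements.

```python
from datetime import date, timedelta


def review_overdue(last_reviewed: date, today: date, interval_days: int = 365) -> bool:
    """True if the policy has not been reviewed within the interval."""
    return (today - last_reviewed).days > interval_days


def past_retention(created: date, retention_days: int, today: date) -> bool:
    """True if data created on `created` has exceeded its retention period."""
    return today > created + timedelta(days=retention_days)


if __name__ == "__main__":
    today = date(2024, 6, 1)
    print("review overdue:", review_overdue(date(2023, 1, 1), today))
    print("past retention:", past_retention(date(2020, 1, 1), 365, today))
```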
The end goal of all of these big data storage objectives is to optimize both the utilization and the spend on your storage resources, whether resources are on premises or in the cloud.