Information stored in data centers but never used is expected to hurt the planet, wasting an enormous amount of energy, Veritas found.
Dark data, or data that is stored in data centers but never used, consumes an inordinate amount of energy, resulting in unnecessary carbon dioxide emissions. In 2020, approximately 6.4 million tons of carbon dioxide will be needlessly released into the atmosphere because of this data, according to Veritas Technologies data.
The focus of this year's Earth Day on April 22, 2020, is climate change, making the consideration of data's impact on the environment even more relevant. Those 6.4 million tons of carbon dioxide equate to the emissions from driving a car 575,000 times around the Earth or the annual output of 80 individual countries, according to a Veritas infographic.
"Lots of transactions occur over the course of doing business: Emails come in, files are sent to us. All of that data is just stored in a repository, which isn't indexed or maintained or monitored or managed in any way; it just continues to grow over time," said Doug Matthews, vice president of data protection and compliance at Veritas. "For some reason, organizations just tend to keep it."
An average of 52% of all data stored by businesses worldwide is considered "dark," as those who manage it aren't aware of its content or value, according to a press release on Tuesday.
Digitization is dominating the enterprise, especially as artificial intelligence (AI), machine learning, and the Internet of Things (IoT) gain ground. An IDC report found that the amount of data stored in the world will grow from 33ZB in 2018 to 175ZB in 2025.
Unless data users change their operations, 91ZB of dark data will exist in the next five years, which is more than four times the amount stored today, the Veritas data found.
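The 91ZB figure is consistent with applying the 52% dark share to IDC's 175ZB projection for 2025. The short Python check below is only a sketch of that arithmetic; it assumes the dark share stays constant through 2025, which the release does not spell out, and the function and constant names are illustrative.

```python
# Figures quoted in the article.
DARK_SHARE = 0.52      # average share of stored business data considered "dark"
TOTAL_2025_ZB = 175    # IDC projection of data stored worldwide in 2025, in zettabytes


def dark_data_zb(total_zb: float, dark_share: float = DARK_SHARE) -> float:
    """Estimate dark data volume, assuming the dark share does not change (an assumption)."""
    return total_zb * dark_share


projected = dark_data_zb(TOTAL_2025_ZB)
print(f"Projected dark data in 2025: about {round(projected)}ZB")
```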
An area of 7,500,000 acres of forest would be necessary to absorb the 6.4 million tons of carbon dioxide. That area is 500 times the size of Manhattan, as shown in the infographic.
"I used to work in the data center and as I walked around the data center, I could always tell when I was coming up on the storage array because it got hot," Matthews said. "Storage arrays are just nothing but energy consuming devices that spin disks. It takes a massive amount of energy to spin those disks up and maintain that storage.
"And at the same time, it has a huge generation of heat. So, it does have a pretty significant environmental impact when you think about the carbon footprint of these organizations," Matthews noted. "Everybody's looking at, 'Let's drive a few fewer cars,' or, 'Let's use a little bit less paper.' But if we could just turn off half of our drives, it'd be a far more significant impact on our carbon."
How to delete dark data
"You can't just ignore this problem," Matthews said. "If an organization is concerned about their carbon footprint, they should prioritize this situation as something they want to solve."
Matthews said organizations can do so through a program of identification, policy, and then procedure. The release specifically outlined tangible steps companies can take within that guidance.
- Identify all data stores and obtain an overview
Enterprise teams need to understand how information is flowing through their organizations; this can be achieved through data mapping or data discovery. Gaining visibility into where sensitive information is stored, who has access to it, and how long it will be kept is crucial to keeping dark data at bay.
- Highlight dark data
Using proactive data management can help companies better see their data, backup infrastructure, and storage. With this, companies can gain control of risks associated with data and make better decisions regarding what data should be deleted.
- Automate the discovery and data insight processes
As data proliferates across the enterprise, companies must keep pace by automating the analytics, tracking, and reporting needed to follow dark data, security, and file use. This practice can result in the handling of billions of files and petabytes of data, so companies should be prepared to archive and back up important data to prevent loss.
- Minimize and place controls surrounding data
Minimizing and limiting data can help organizations reduce the overall amount of data being stored and keep track of the data being collected. Classification, flexible retention, and compliance policies facilitate the deletion of nonessential information.
- Ensure consistent adherence to compliance standards
Along the way, companies must maintain compliance with standards like GDPR and be able to monitor breach activity across all data. The more data an organization has, the more it has to monitor for cybersecurity issues.
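To make the identification and automation steps above more concrete, here is a minimal Python sketch that walks a directory tree and flags files that have not been modified for a given period. It is an illustration only, not the method Veritas describes: treating "not modified in a year" as a proxy for dark data is an assumption, and the names `find_stale_files`, `summarize`, and the 365-day default are hypothetical.

```python
import os
import time
from typing import Dict, Iterator, Optional, Tuple

SECONDS_PER_DAY = 86400


def find_stale_files(
    root: str, max_age_days: float = 365.0, now: Optional[float] = None
) -> Iterator[Tuple[str, int, float]]:
    """Yield (path, size_bytes, age_days) for files not modified within max_age_days.

    Assumption: last-modified time stands in for "never used"; a real
    classification would also consider content, ownership, and retention rules.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file disappeared or is unreadable; skip it
            if info.st_mtime < cutoff:
                yield path, info.st_size, (now - info.st_mtime) / SECONDS_PER_DAY


def summarize(root: str, max_age_days: float = 365.0) -> Dict[str, int]:
    """Return the count and total size in bytes of stale files under root."""
    stale = list(find_stale_files(root, max_age_days))
    return {"files": len(stale), "bytes": sum(size for _, size, _ in stale)}
```

Calling `summarize("/path/to/share")` on a hypothetical file share returns a small report of how many files look stale and how much space they occupy, which could feed a review before anything is archived or deleted.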