You’ve heard of activists trying to save the bees and save the trees, but now there’s an increasingly concerted effort to save endangered data.
Software archiving is nothing new. Organizations such as Archive.org, Bitsavers.org, the federal government’s National Software Reference Library, and many smaller players have been working for years to post applications online for public download, or at least for browser-based emulation.
It was never easy, and now it’s becoming more difficult. Preservationists are pooling resources because they realize that programs are going cloud-native and upgrades are increasingly transparent to users, which raises a hard question: how do you take snapshots of a program that’s reliant on constantly changing infrastructure?
“The Software Preservation Network (SPN), we make no claims that we’re the first people,” noted SPN’s Jessica Meyerson, a digital archivist at the University of Texas at Austin. “Many archivists, information professionals, and just individuals… have become the caretakers and maintainers of legacy software just because they see the value in doing so. We ground our current activities as just the next step in the 30-plus years in the discourse of archival literature.”
The SPN was formed in 2014 and will reach the end of its grant in September 2017. Meyerson said she believes its roots are strong enough for the coalition to stand on its own. A meeting in summer 2016 led to a community roadmap, and legal professionals attended to help the group pursue its mission while still following copyright laws. Together, the participants created a five-year roadmap emphasizing curation, governance, infrastructure, legal issues, and metadata, she explained.
Even the United Nations is on the case. On April 3, 2017, UNESCO announced an agreement with the French National Institute for Research in Computer Science and Automation (INRIA), with a stated goal “to contribute to the preservation of the technological and scientific knowledge contained in software [and] promoting universal access to software source code.”
“This agreement between UNESCO and INRIA aims to foster an international debate and actions in favor of a universal access to all digital documents, but also to preserve scientific and technical knowledge contained in software,” the groups jointly stated.
Meyerson emphasized that plenty of hard work is still ahead. “There are a lot of high-energy, high-talent people trying to work together. That being said it’s still the early days,” she said. The reason for that analysis despite all the efforts so far? Lack of corporate involvement.
“People have reached out to companies, but it hasn’t been a systematic effort. We have a little bit of work to do about what is our message as cultural heritage before we start approaching software companies. We’re also developing more of technical literacy,” Meyerson noted.
Other organizations, such as the Digital Library Federation and the National Digital Stewardship Alliance, are sponsoring a series of conferences called Endangered Data Week, held April 17-21, 2017, at locations around the world. As with the SPN, there’s not yet much of an enterprise focus; that may reach the agenda next year, Federation program associate Katherine Kim said.
Some in the corporate world are indeed paying attention. Danny Allan, a cloud and alliance executive at data backup specialist Veeam, said companies are starting to ask about backing up their custom applications, not just the user-created data. Veeam has spent the past few years successfully competing against giants Commvault, EMC, IBM, and Veritas.
Backing up compiled software products is one thing, however. “That is very different in the enterprise world. In the enterprise world, they will write internal software but [it] is far more agile,” Allan explained.
Allan also noted that internally developed software can be dynamically created from scripts, built from mashed-up data servers, and delivered on virtualized operating systems or through streams. Entirely new approaches are needed to archive such programs, and applications that link to Internet of Things hardware or cross into other companies’ infrastructures could be exponentially more difficult to archive.