The national governments of the US, the UK, and other G7 nations have been focusing more attention on the economic value of open data, as opposed to broader societal benefits.

While pointing to evidence that open data fuels economic activity is a good rationale for the release of relevant data sets, it’s far from the only impact that releasing government data can have upon the world. As I’ve explored in past columns, publishing open data can increase resilience against climate change, offer insight into healthcare costs and outcomes, protect consumers, and fuel accountability and transparency.

If national governments are going to invest time, money, and public attention in releasing data, they should also focus on releases that have social benefits as well as economic outcomes. Last week, looking for fresh examples, outcomes, and emerging issues in this area, I attended a forum on the social impact of open data hosted by the Center for Data Innovation in Washington, DC. (Video of the event is embedded below.)

If you watch, you’ll hear remarks on the social impact of open data (PDF) by Maureen Ohlhausen, Commissioner of the Federal Trade Commission (FTC), followed by a panel discussion between Daniel Castro, director of the Center for Data Innovation, Sandra Moscoso, deputy program manager at the World Bank, Brian Rayburn, lead data scientist at Symcat, and Emily Shaw, national policy manager at the Sunlight Foundation.

“Often, when people talk about government and data they focus on government as a consumer of information and how government should or should not be limited in the data it can collect and use,” said Ohlhausen. “We have an entire section of constitutional law dedicated to that topic.”

But there is another aspect of government data that isn’t discussed as much, except perhaps by the people in this room: Government as a producer of data. Federal, state, and local governments generate and store massive amounts of data about themselves, about us, and about the world around us. Even before the very first U.S. census report, government has been producing large, and increasing, amounts of data. Government produces many types of data: Personal data, such as social security earnings, tax information, unemployment filings, and voter registration; societal data such as demographics, employment estimates, and economic indicators; and impersonal or scientific data, such as weather and climate measurements and geolocation data. There is great potential in applying powerful new big data tools to the rich troves of government data. The private sector could use the wide range of government-produced data to reveal new insights into difficult problems in nearly every area of human endeavor.

The discussion took place in the context of a new section of the federal open data platform that profiles companies that use government data, as a way of demonstrating the impact of its publication. The profiles raised a few eyebrows this past April, when the platform featured only examples of the economic impacts of open data. As readers of this column know, freely publishing government data in a machine-readable format, under an open license, can have salutary economic outcomes in sectors ranging from real estate, health, transit, and energy to consumer finance and weather.

Over the course of the event, the panelists advocated for the release of open data that benefit citizens, not just startups and established businesses. To put it simply, beyond rationales of increased efficiency, reduced costs, increased productivity, and economic growth that will spur the release of new data, there’s considerable potential for open data releases to extend to positive social justice, environmental, educational, public safety and health outcomes.

Ohlhausen outlined a role for the FTC in regulating and guiding the publication of open data and its use in data analysis:

“By understanding the limits of big data and emphasizing the need for human judgment in the use of such tools, the FTC can help tamp down hype over big data,” she said. “The FTC can help create a healthier regulatory atmosphere by critically evaluating the claims of both the pop-science promoters of big data as a ‘magic bullet’ solution and the naysayers who fear massive consumer harm from all-knowing algorithms. A realistic understanding of big data’s potential will help the agency to identify and focus on actual harms to consumers, if they occur.”

The return on investment for open government goes beyond making government institutions and services more transparent, and the people that run them more accountable for the use of taxpayer dollars: In systems of governance that are of the people, for the people, and by the people, open government provides access to information about how those people are being governed and new opportunities to participate in that governance. That means that focusing on publishing open data with economic value shouldn’t preclude or take too much focus away from digitizing and releasing data with other societal value.

There’s also potential for increased risks to privacy and security, and of discrimination, if rules, regulations, norms, ethics, and a careful approach to enterprise inventories, digitization, and data publishing aren’t part of the process, or if releases fuel the creation of applications and services that favor people who are already privileged in society. Ohlhausen spoke to those issues in her remarks:

“Obviously many, perhaps even the majority, of government data sets have nothing to do with ‘personally identifiable information,’” she said.

Open access to many scientific and economic data sets, for example, raises no privacy risks. However, opening other useful data sets may raise some privacy concerns. For example, applying big data techniques to government health data or education records could help address the most pressing societal issues we face, but people understandably worry about how such information is used and shared. The FTC can guide other government agencies on how to open access to data while mitigating privacy risks through aggregation, de-identification, use-based limitations, and other techniques. Furthermore, the FTC must continue to explore how to resolve the tension between the promise of big data and certain Fair Information Practice Principles such as notice and purpose limitation and data minimization, which, strictly applied, could hinder big data’s promise.

During the event, I published a series of tweets with pictures, links, and references to cited research, projects, and services, all of which I feature below.

During the brief question and answer period that followed, I had an opportunity to question the FTC Commissioner about the agency’s open data practices and took it. (You can watch her answers here.) To her credit, Ohlhausen followed up on Twitter with answers to my questions.

In her replies, she shared a link to the FTC’s open government plan and examples of newly released datasets, including the .csv of consumer complaints that she referenced. I found that the data didn’t include the names of individual companies, only aggregates by industry.

I hope the FTC takes a proactive approach to converting any data that it still publishes as PDFs into structured data online, leading by example, and uses Freedom of Information Act requests to prioritize future releases, including data on which companies are subject to the most complaints.