Tim O'Reilly on open data: Cheap may be open enough

Open source and open data. Is cheap and accessible an acceptable substitute for open data?

Open data makes for good business and good government, but is cheap and accessible an acceptable substitute for open data? Definitely maybe, according to Tim O’Reilly, one of the industry’s most vocal advocates for open source and open data, in an interview with McKinsey. In O’Reilly’s world, openness is not an end in itself but a means to an end. Hence, delivering the effects of openness may be a good enough surrogate for truly open data.

Openness as ideology

The open source versus free software debates of yesteryear have mostly faded. As an industry, we seem to have settled into comfortable patterns of using permissive, Apache-style licensing to attract developers while retaining the ability to blend proprietary functionality to attract customers.

For other projects where coherence in a common core is necessary, more restrictive GNU General Public License-style licensing is sometimes used, as in Linux. Even among GPL adherents, the choice of license is no longer about ideology; it has become a pragmatic development or business decision.

None of this really gets anyone mad anymore.

Data changes everything

At least, until data came along. Data, of course, has always been important, but only in the past few years have we come to appreciate it as the secret sauce that can drive revenue, improve government services and more. Tim O’Reilly began preaching the data gospel in 2005, and slowly the industry caught on.

Today, as much as we obsess over the importance of big data, Gartner argues that for those “seeking competitive advantage through direct interactions with customers, partners and suppliers, open data is the solution.” Open data is the new open source, with much more at stake.

And yet O’Reilly insists in his McKinsey interview that we may be overselling openness in open data. As he notes, “There’s a pragmatic open and there’s an ideological open. And the pragmatic open is that it’s available. It’s available in a timely way, in a nonpreferential way, so that some people don’t get better access than others. And if you look at so many of our apps now on the web, because they are ad-supported and free, we get a lot of the benefits of open. When the cost is low enough, it does in fact create many of the same conditions as a commons.”

In other words, there are different kinds of openness, and having an open source or open data license/commitment is not the only way to deliver user or developer value.

Take, for example, Google Maps. Google doesn’t open its maps data, though it starts with open, public data from the U.S. Census Bureau (for U.S. maps, obviously). And yet we don’t complain about this because Google makes its Maps product available to developers programmatically through its Maps API, and to consumers through a free service that costs us nothing...except our location data that we feed it.

Truly open? No. Open enough? Yes.

Paying with our data

The hitch, of course, is that “open enough” can easily become “not open at all.” We faced this in the open source community’s Open Core debates (Open Core being the practice of keeping an open source core and sprinkling proprietary add-ons around the periphery of a project), and we will undoubtedly face the same discussions within the open data movement.

After all, as O’Reilly posits, a strategy of “accessible but not open” “requires great restraint...because it becomes say, ‘Well, actually we just need to take a little bit more of the value for ourselves. And oh, we just need a bit more of that.’ And before long, it really isn’t open at all.”

That’s the danger.

Complicating this risk of overreach, as Tom Lee suggests, is the fact that “most of the benefits of open data will accrue to consumers and citizens, not to investors and firms.” Neither the end-user nor the vendor of data can afford for the balance to tip too much in either direction. The better companies can monetize accessible data, the more of such data there will be. If it’s impossible to monetize 100-percent open data, there will be very little of it created, at least by profit-seeking enterprises.

I don’t think we want that. In open source, there is exactly one pure-play, large open source vendor: Red Hat. There are, however, many companies built on open source, companies like Google and Facebook, which give away free services on top of this software, all paid for with our data, data that is often made open...enough.

