My inner geek finds the sheer quantity and variety of data available on Data.gov fascinating. The site’s expanded offerings were announced on the one-year anniversary of the U.S. government’s digital strategy directive, which has a big focus on mobility.
The variety of data on Data.gov is overwhelming, so it helps to have an idea of what you want or need before you start perusing the site. I wasted a lot of time trawling Data.gov and finding plenty that was interesting but had no real application for my development work (the White House visitor records are a great example). Data.gov includes everything a developer needs to put its massive amount of data to use.
Show me the data
Part of the U.S. government’s digital strategy is to make as much data available as possible. This was confirmed a couple of weeks ago with the open data policy, which says all future data collected should be open and accessible — a cornerstone of transparency.
Data.gov offers two basic avenues for accessing data: raw data downloads and programmatic access via APIs. The raw data option provides data in a downloadable format; one example is a list of FDIC Failed Banks as a comma-separated values (CSV) file. The APIs provide an interface to the data, allowing you to filter it down to just what you need.
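To make the raw data route concrete, here is a minimal Python sketch of reading a downloaded CSV file and filtering it locally. The column names and rows are invented placeholders (not real FDIC data), and the helper name banks_in_state is hypothetical; a real download will have its own header row, so adjust accordingly.

```python
import csv
import io

# Hypothetical sample standing in for a downloaded CSV file. The column
# names and rows are invented for illustration; they are not real FDIC data.
SAMPLE_CSV = """Bank Name,City,State,Closing Date
Example Bank A,Springfield,KY,2013-01-11
Example Bank B,Riverton,GA,2013-03-08
Example Bank C,Lakeside,KY,2013-05-17
"""

def banks_in_state(csv_text, state):
    """Return the rows whose State column equals the given abbreviation."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row["State"] == state]

# Filter the sample down to one state and print two of the columns.
for row in banks_in_state(SAMPLE_CSV, "KY"):
    print(row["Bank Name"], row["Closing Date"])
```

With a raw download, all of the filtering happens on your side after you have fetched the whole file. An API, by contrast, usually lets you push that filtering into the request itself.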
What is available?
An analysis of the Data.gov offerings by Nextgov.com (Google spreadsheet) showed many agencies have met the transparency goal. Data.gov offers data access by way of more than 400 APIs. The API list is broken down by office, and the Developer Hub lists agencies and programs. Another page provides a breakdown of the number of datasets available per federal agency.
Most of the APIs return data in XML or JSON format. The Census Bureau APIs are a good example: a simple HTTP GET request returns JSON data. However, each request requires an API key, which you obtain through a simple registration process. The USA Spending API also uses HTTP GET requests but returns XML-formatted data. The Federal Reserve’s API, called FRED, likewise lets you retrieve data via HTTP GET requests. Many agencies and departments also offer RSS feeds, such as the White House Press Office feed.
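The key-plus-GET pattern described above can be sketched in a few lines of Python. Treat this as an illustration under stated assumptions, not any agency's documented interface: the base URL, the query parameter names (including "key"), and the sample response are all placeholders, and I am assuming a JSON payload shaped as an array of rows with the header row first. Check each API's documentation for the real endpoint and parameters.

```python
import json
from urllib.parse import urlencode

def build_request_url(base_url, params, api_key):
    """Append query parameters, plus the API key, to a base URL."""
    query = dict(params)
    query["key"] = api_key  # assumed parameter name; varies by API
    return base_url + "?" + urlencode(query)

def rows_to_records(json_text):
    """Turn a JSON array of rows (header row first) into dictionaries."""
    rows = json.loads(json_text)
    header, data = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in data]

# Placeholder endpoint and key; substitute the values from your registration.
url = build_request_url("https://api.example.gov/data", {"get": "POP"}, "YOUR_KEY")
print(url)

# Invented response body standing in for what a GET request might return.
sample_response = '[["NAME", "POP"], ["Alpha", "10"], ["Beta", "20"]]'
for record in rows_to_records(sample_response):
    print(record["NAME"], record["POP"])
```

To fetch live data, you would pass the built URL to urllib.request.urlopen (or a similar HTTP client) and hand the response body to rows_to_records. I left the network call out so the sketch runs without a registered key.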
The Developer Apps Showcase features examples that demonstrate how others have used what is freely available on Data.gov. A fun example is the Eat Shop Sleep mobile application, which helps you find nearby restaurants (filtered by specific characteristics), shops, and hotels. The Software Challenges page lists real development problems that need solving.
An interesting aspect of the federal government’s open approach is how it trickles down to state and local governments. A page on the site keeps a running total of state and local governments, as well as international groups, that are using this approach; it includes a clickable map below the numbers. At the time of this writing, it showed 39 states on board, but this can be misleading because the level of adoption varies. My home state of Kentucky is listed, but it offers only a small amount of data, whereas my home city of Louisville offers quite a bit.
The Open Government Platform (OGPL) provides an open source version of Data.gov that can be used by any government (national, state, local, etc.) to promote transparency. It is built using PHP. (An interesting note: It was developed via a partnership with India.)
A beta version of OGPL is available via GitHub; India is using the current version for its data portal site. As with any open source project, you are invited to participate in its evolution. In addition to OGPL, there are a number of other government open source projects hosted on GitHub.
If you need to know something about the government, Data.gov might provide the raw data to answer your questions. The trick is navigating the site to find the data or API you need. The Developer forums section allows you to discuss topics with the community and contribute to the code development. The open source nature of the data and code on the site is refreshing, but the cynic in me knows there is so much more data we cannot access.
You can follow Data.gov progress and announcements via Twitter.