In October 2009, the Obama administration announced that the National Archives and Records Administration and the Government Printing Office would begin publishing the Federal Register in XML. This is a major step forward for open government and is probably as significant as the creation of the Register itself.
The Federal Register was created by the Federal Register Act of 1935 after a Supreme Court case (Panama Refining Co. v. Ryan, 293 U.S. 388) in which the Court ruled that companies could not be held responsible for violating interstate trading regulations if the regulations were not made public.
President Roosevelt had already appointed a committee of the National Emergency Council to study the idea of publishing a gazette of orders issued by the Executive Branch, but the Supreme Court decision effectively ended the work of that committee, and on July 26, 1935, the Federal Register Act was signed into law. The Act created a lasting partnership between the National Archives and the Government Printing Office (GPO).
Despite the significance of this daily publication of all government activities, the Federal Register has never been the most accessible document. It is lengthy and incredibly dense, and it’s hardly morning reading for most people. It was also not always easy to get your hands on. Roughly 6,000 copies are distributed to libraries and research centers across the country, but there wasn’t really an effective distribution method until the internet came about. The GPO has been making the Federal Register available in PDF and HTML for some time now, and in 2009, over 200 million Federal Register documents were downloaded from the GPO website. That is a significantly wider distribution than could ever be achieved with printed issues.
Even so, the PDF version of the Register is still cumbersome and difficult to use. It is organized on a department-by-department basis, and it is not easy to determine the scope and geographical reach of the items published. Notices about public consultations on regulations still frequently get overlooked by everyone except those closest to the issue: lobbyists, advocacy and watchdog groups, lawyers, and academics. And it comes out every day, chock-full of new information. Keeping track of things that might affect you would be a full-time job.
So the announcement that the GPO and the National Archives are moving to XML is a welcome step toward improving access to information about the federal government. It opens up a world of possibilities for developers and web designers eager to publicize and distribute government information.
Since the announcement, a few exciting projects have been exploring these possibilities with excellent results.
FedThread.org was created by Princeton University’s Center for Information Technology Policy. The website downloads the daily XML version of the Federal Register from the GPO’s FDsys site and displays the entries in a readable format. It is slick and streamlined and has a really nice comment feature that lets users comment on individual sections of an entry in the Federal Register. Browse through the Register, and wherever a small icon appears beside a section, you can click on the sidebar to see the user’s comment.
You can also make your own comments, or “notes.” The site provides direct links to the official PDF and HTML versions of the Register entry and provides an RSS feed for users’ notes. You can even set up email notices using FeedMyInbox. There is also a decent advanced search option that allows you to search by agency and subagency and prompts you with suggestions as you type.
FedThread is a great site to simply check what’s been published in the Federal Register. But it doesn’t offer a whole lot in the way of interaction or social networking and it doesn’t exploit the full potential of XML publishing. Another site that is sure to set the bar for future government data repositories is GovPulse.
GovPulse really does it all. It gathers Federal Register XML from http://data.gov/details/101 and exploits the XML to its fullest. The site also adds a whole lot of functionality with tools like Ruby, Ruby on Rails, Geokit, and the CloudMade APIs. The site is designed to help users actually get involved with the federal government. The homepage shows public comment periods that are closing within the next seven days and comment periods that opened in the last seven days. It also shows rules that will take effect in the next seven days and rules that have been proposed in the last seven days.
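The seven-day windows on the homepage boil down to a simple date filter. GovPulse itself is built with Ruby on Rails, so the Python sketch below is not its code; it is only a minimal illustration of the idea, and the entry titles, dates, and the `closing_soon` helper are all made up for the example.

```python
from datetime import date, timedelta

# Hypothetical entries: titles and dates are invented for illustration.
ENTRIES = [
    {"title": "Proposed rule A", "comments_close": date(2010, 3, 3)},
    {"title": "Proposed rule B", "comments_close": date(2010, 3, 12)},
    {"title": "Notice C", "comments_close": None},  # no comment period
    {"title": "Proposed rule D", "comments_close": date(2010, 2, 27)},
]


def closing_soon(entries, today, days=7):
    """Return entries whose comment period closes within `days` of `today`,
    soonest first. Entries with no comment period, or one that has already
    closed, are skipped."""
    cutoff = today + timedelta(days=days)
    due = [
        e for e in entries
        if e["comments_close"] is not None
        and today <= e["comments_close"] <= cutoff
    ]
    return sorted(due, key=lambda e: e["comments_close"])


if __name__ == "__main__":
    for entry in closing_soon(ENTRIES, today=date(2010, 3, 1)):
        print(entry["comments_close"].isoformat(), entry["title"])
```

The same filter with the comparison reversed (dates between `today - 7 days` and `today`) would cover the "opened in the last seven days" list.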
In this way, GovPulse is not simply presenting a daily XML feed in a readable format; it is harvesting information published over long periods of time and compiling it in a sensible and meaningful way. I can browse by agency, topic, or publication date, or I can search by keyword, document type, agency, publication type, or even geographic location. And a nifty set of charts created with the Google Charts API shows the rate of publications by different departments over a year, five years, or in the current quarter.
Each entry is presented in a well-structured way, and GovPulse does a number of things to make it easier to interact with the government and with others. Contact information for each entry is pulled from the text and displayed prominently in the right sidebar, and a link to submit a formal comment on Regulations.gov is provided. Links to share the entry on various social networking sites are listed, and related topics can also be found in the sidebar. When you view a topic, you can see the most recent entries with that topic, along with pie charts of the agencies using the topic and the types of entries associated with it. The developers clearly tried to make it as easy as possible for anyone to use the Federal Register.
The geocoding of entries in the Register is probably the most useful feature. A map on the homepage shows you entries that affect or have affected your location. It is impressive, and it does not just geocode U.S. locations. From my map of Atlantic Canada, I can see all the Federal Register entries that have ever mentioned Nova Scotia (going back to 1994 at least), right down to the postal code. The map also tells me which agencies are active in my location.
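To give a rough sense of what geocoding an entry involves, here is a toy Python sketch that looks for known place names in an entry’s text and attaches coordinates to each match. It is not how GovPulse or Geokit actually work; the tiny gazetteer, its approximate coordinates, and the `find_places` helper are assumptions made up for this illustration.

```python
import re

# Hypothetical mini-gazetteer. Coordinates are rough (latitude, longitude)
# values for illustration only, not authoritative data.
GAZETTEER = {
    "Nova Scotia": (45.0, -63.0),
    "Puget Sound": (47.7, -122.4),
}


def find_places(text, gazetteer=GAZETTEER):
    """Return {place name: (lat, lon)} for every gazetteer name that appears
    in `text` as a whole phrase, ignoring case."""
    found = {}
    for name, coords in gazetteer.items():
        pattern = r"\b" + re.escape(name) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found[name] = coords
    return found


if __name__ == "__main__":
    sample = "Imports of lobster from Nova Scotia are subject to inspection."
    print(find_places(sample))
```

A real service would rely on a full geocoding API and handle ambiguous names, but the output is the same kind of thing: entries tagged with points that can be dropped onto a map.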
GovPulse is a great example of the kinds of resources that can be created using XML data. The site makes use of a whole suite of open-source tools, programming, and straight-up innovation, but these tools could not work together without consistently formatted XML data.
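To see why consistent formatting matters, consider this small Python sketch. It parses a made-up XML snippet and tallies entries by agency, the kind of count that could feed a chart. The tag names (`FEDREG`, `RULE`, `NOTICE`, `AGENCY`, `SUBJECT`) are a simplified assumption loosely modeled on the Register’s XML, which is more deeply nested in practice, and the sample entries are invented.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Simplified, invented sample. The real Federal Register XML has more
# structure; these tag names are assumptions for illustration.
SAMPLE = """
<FEDREG>
  <RULE>
    <AGENCY>Department of Energy</AGENCY>
    <SUBJECT>Energy Conservation Standards</SUBJECT>
  </RULE>
  <NOTICE>
    <AGENCY>Department of Energy</AGENCY>
    <SUBJECT>Public Meeting</SUBJECT>
  </NOTICE>
  <NOTICE>
    <AGENCY>Environmental Protection Agency</AGENCY>
    <SUBJECT>Request for Comments</SUBJECT>
  </NOTICE>
</FEDREG>
"""


def list_entries(xml_text):
    """Turn each child of the root element into a small dict."""
    root = ET.fromstring(xml_text.strip())
    return [
        {
            "type": node.tag,
            "agency": node.findtext("AGENCY", default="").strip(),
            "subject": node.findtext("SUBJECT", default="").strip(),
        }
        for node in root
    ]


def count_by_agency(entries):
    """Count how many entries each agency published."""
    return Counter(e["agency"] for e in entries)


if __name__ == "__main__":
    entries = list_entries(SAMPLE)
    for agency, total in count_by_agency(entries).most_common():
        print(total, agency)
```

Because every entry uses the same tags, a dozen lines of parsing are enough; the same job against PDF would mean scraping text with no reliable markers.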
The Federal Register is probably the most prominent publication the U.S. government is releasing in XML, but it is certainly not the only one. The FDsys site has a variety of other publications available in XML and Data.gov has huge sets of data formatted in XML, CSV, KML, and other interoperable file formats. Other government sites like Regulations.gov are almost certainly making use of these advances in text encoding too. It’s encouraging to see the government move in this direction, and it has spawned other efforts to create and share open government data. The UK now has Data.gov.uk and the Guardian newspaper has created a World Government Data Bank. Already sites like Guardian API Maps are popping up to make use of these new tools and help improve them.
Groups like the Sunlight Foundation have actually been at this for quite a while, but it looks like things are picking up steam now that governments are releasing data in open and accessible formats. Programmable Web has a nice compilation of other government data mashups and APIs.
If you are interested in getting involved with efforts to improve and change Data.gov, check out this discussion on IdeaScale. If there are other government data projects you think I overlooked or should know about, then please let me know!