together with my colleagues eric kansa and raymond yee, i have just published a new paper about Web Services for Recovery.gov as UCB ISchool Report 2009-035. the main goal of this report is to highlight a side of the open and transparent data debate that is often overlooked. some people concentrate on web site design (HTML/CSS, usability, and accessibility), and others concentrate on how to make data available, which essentially boils down to building download portals. our main contribution is to show that download-oriented data access is not sufficient to provide timely access to evolving data sources, for two main reasons:
- data is never complete. data is always connected to more data, and download formats often disrupt this connection, which decontextualizes the data and makes it harder to understand how it is embedded into the world. once data is decontextualized, it can be very hard or almost impossible to contextualize it again by re-establishing all the connections to related data that existed where the data originated.
- data is never static (unless it's purely historical data). downloads are decontextualized not only in terms of connectivity with other data, but also with regard to time. any data source that is frequently updated must have simple ways of synchronizing data, so that more efficient ways of data access than periodic downloads can be established.
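to make the two points above a bit more concrete, here is a minimal sketch of what feed-based synchronization could look like on the consumer side. it is not taken from the report: the feed content, the `urn:example:...` identifiers, the `example.org` URIs, and the function names are all made up for illustration, and a real client would fetch the feed over HTTP instead of using an inline string. the sketch assumes an Atom feed whose entries carry an `updated` timestamp and links to related resources, and it returns only the entries that changed since the last synchronization, keeping their links intact.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical feed content (assumption); a real client would fetch this over HTTP.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example award reports</title>
  <id>urn:example:reports</id>
  <updated>2009-08-03T09:00:00Z</updated>
  <entry>
    <id>urn:example:report:2</id>
    <title>Report 2</title>
    <updated>2009-08-03T09:00:00Z</updated>
    <link rel="alternate" href="http://example.org/reports/2"/>
    <link rel="related" href="http://example.org/agencies/a"/>
  </entry>
  <entry>
    <id>urn:example:report:1</id>
    <title>Report 1</title>
    <updated>2009-08-01T12:00:00Z</updated>
    <link rel="alternate" href="http://example.org/reports/1"/>
  </entry>
</feed>"""


def parse_timestamp(text):
    """Parse an Atom (RFC 3339) timestamp into a timezone-aware datetime."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat().
    return datetime.fromisoformat(text.strip().replace("Z", "+00:00"))


def entries_since(feed_xml, last_sync):
    """Return the entries updated after last_sync, with their links preserved."""
    root = ET.fromstring(feed_xml)
    changed = []
    for entry in root.findall(ATOM + "entry"):
        updated = parse_timestamp(entry.findtext(ATOM + "updated"))
        if updated > last_sync:
            # Keep the links: they are what connects an entry to related data.
            # If a rel value occurs more than once, the last link wins here.
            links = {
                link.get("rel", "alternate"): link.get("href")
                for link in entry.findall(ATOM + "link")
            }
            changed.append(
                {"id": entry.findtext(ATOM + "id"), "updated": updated, "links": links}
            )
    return changed


if __name__ == "__main__":
    last_sync = datetime(2009, 8, 2, tzinfo=timezone.utc)
    for item in entries_since(SAMPLE_FEED, last_sync):
        print(item["id"], item["updated"].isoformat(), item["links"])
```

a consumer that remembers the time of its last synchronization only has to process the entries that changed since then, and the links in each entry keep the connection to related data that a plain file download would lose.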
starting from our earlier report on how ARRA reporting should use lightweight and web-based technologies, we are again focusing on technologies which are powerful enough to implement dynamic and machine-friendly services, yet simple enough that citizen developers can easily understand and use them. refining our earlier report, we are now focusing on widely established and supported technologies such as XML and feeds. we make concrete recommendations for how to build web services for recovery.gov that are more sophisticated than download approaches, while still being easy to implement on the provider side as well as on the consumer side. here's our report's abstract:
One of the main goals of the Recovery.gov Web site is to provide information about how funds for the American Recovery and Reinvestment Act (ARRA) of 2009 are allocated and spent. In this report, we propose a reporting architecture that would focus on the reporting services rather than the Web site and page design, and that uses these Web services to build the user-facing part of ARRA reporting. Our proposed architecture is based on simple and well-established Web technologies, and the main goal of this architecture is to provide citizens and watchdog groups simple and easy access to machine-readable data. Our architecture uses a more sophisticated framework than simple downloads of data files. Our proposed architecture is based on the principles of Representational State Transfer (REST) and uses established and widely supported Web technologies such as feeds and XML. We argue that such an architecture is easy to design and implement, easy to understand for users, and easy to work with for those who want to access ARRA reporting data in a machine-readable way.
you can find the Web Services for Recovery.gov report in the iSchool's eScholarship repository. any feedback about the report is very welcome, and please make sure to also visit our new Architecting Transparency web site, which has been specifically built to highlight and demonstrate some of the issues described in the report.