the OMB will release an updated set of stimulus reporting guidelines this week, or at least that is what some unofficial sources suggest. it will be very interesting to take a careful look at them. the initial guidelines, published in february, got us very excited, because they looked as if somebody actually had a pretty good grasp of openness, transparency, and web architecture: they specified that information flows should be based on feeds, and they looked really promising. it was the first time feeds were recognized as a relevant way of moving government information around.
we had some quibbles about how underspecified the feed architecture was, took a closer look, started suggesting what exactly needed to be fixed, and eventually published a report on how it should be fixed. that report did not specify everything down to the last detail, but it would have been a good starting point for getting down to business with the feed-based architecture. in the meantime, only a few agencies actually published feeds, even fewer of those feeds were findable and usable, and the real reporting was done by email, invisible to anybody other than the OMB back office.
to a large extent, this cannot be blamed on the agencies; it has to be blamed on the guidance, which left many crucial questions open and provided no help for agencies to publish feeds, or to test and validate the feeds they published. this led to odd developments such as NASA's monofeeds, where each report is published in a feed of its own.
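to make the contrast concrete, here is a minimal sketch of what a consolidated agency feed could look like: one feed, with one entry per report, so that consumers only ever have to poll a single URI per agency. all titles, ids, and URIs in this sketch are made-up placeholders, not actual recovery.gov data.

<feed xmlns="http://www.w3.org/2005/Atom">
  <title>example agency recovery reports</title>
  <id>http://reports.example.gov/recovery/feed</id>
  <updated>2009-06-05T00:00:00Z</updated>
  <author><name>example agency</name></author>
  <!-- one entry per report; new reports simply become new entries -->
  <entry>
    <title>weekly report for the week ending 2009-05-29</title>
    <id>http://reports.example.gov/recovery/2009-05-29</id>
    <updated>2009-05-29T00:00:00Z</updated>
    <link href="http://reports.example.gov/recovery/2009-05-29.xml"/>
    <summary>funds allocated and spent, week ending 2009-05-29</summary>
  </entry>
  <entry>
    <title>weekly report for the week ending 2009-06-05</title>
    <id>http://reports.example.gov/recovery/2009-06-05</id>
    <updated>2009-06-05T00:00:00Z</updated>
    <link href="http://reports.example.gov/recovery/2009-06-05.xml"/>
    <summary>funds allocated and spent, week ending 2009-06-05</summary>
  </entry>
</feed>

with monofeeds, each of those entries ends up in a feed of its own, which means consumers first have to discover every new feed before they can even start polling it.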
when the guidelines were updated in april, we were disappointed that the one section that would have been essential to make openness and transparency work, the section about feeds, remained basically unchanged. this is the section that would have required a serious overhaul, and instead of getting it to the point where it could serve as implementation guidance for agencies, reporting is still being done by email, in the most opaque way imaginable. it will be interesting to see whether the next update of the guidelines will finally get rid of email reporting and make feed-based reporting mandatory, based on well-defined and well-supported guidelines.
given that the budget for recovery.gov is $84 million, spending a small fraction of it on specifying the actual reporting architecture would seem essential. if the architecture is feed-based, as mandated in the two initial guidances, then a lot of the back-end black magic can be turned into open and transparent feed-based data management. however, it seems as if the web site overhaul and the reporting architecture will be lumped together, and that is not a good idea to begin with. transparency means that the reporting data is available from its sources, the agencies and the states; recovery.gov can then use this data in the same way as anybody else who wishes to work with it.
the most transparent way to go would be to clarify the feed guidelines, provide a Federal Feed Cloud as supporting infrastructure for agencies that do not wish to host their own feeds, and have recovery.gov work on the data in that cloud. instead, what might happen is that the recovery.gov overhaul is outsourced as one project, and whoever gets it will implement a traditional silo with opaque back-end processes and a nice, polished front-end to make up for it.
I'd like to know just how varied the various agency infrastructures are... that is, if I wanted to write the back-end stuff, just what the hell do I have to support? That question scares me.
Posted by: joe | Monday, June 08, 2009 at 21:35
@joe: the whole idea of a simple and standards-based architecture is that it minimizes the effort it takes to participate (for publishers *and* consumers). producing feeds can often be done with existing feed packages, or at the very least you'll have some XML support and can create feeds at that level. if the guidelines specified templates and examples, it really wouldn't be all that scary. our stuff at http://isd.ischool.berkeley.edu/stimulus/feeds/ is scraped from recovery.gov with some PHP, and then we use XSLT to produce feeds and XML reports. it's not pretty, because we have to deal with the excel data published by recovery.gov, but it's not rocket science either, and it probably did not take more than a week or two. so it's really doable, and it's not hard, either.
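to give a rough idea of the XSLT end of such a pipeline: assuming the scraped rows have been dumped into a simple intermediate XML (the reports/report structure below is made up for this sketch, not the actual format we use), turning them into an atom feed boils down to something like this:

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/2005/Atom">
  <xsl:output method="xml" indent="yes"/>
  <!-- turn a flat list of scraped report rows into one atom feed -->
  <xsl:template match="/reports">
    <feed>
      <title>recovery reporting feed</title>
      <id>http://example.gov/recovery/feed</id>
      <updated><xsl:value-of select="@generated"/></updated>
      <author><name>reporting agency</name></author>
      <xsl:for-each select="report">
        <entry>
          <title><xsl:value-of select="agency"/>: <xsl:value-of select="title"/></title>
          <id><xsl:value-of select="uri"/></id>
          <updated><xsl:value-of select="date"/></updated>
          <link href="{uri}"/>
          <summary><xsl:value-of select="summary"/></summary>
        </entry>
      </xsl:for-each>
    </feed>
  </xsl:template>
</xsl:stylesheet>

the real data is messier than that, but that's the basic shape of it.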
Posted by: dret | Monday, June 08, 2009 at 21:43