one of the interesting developments in the highly anticipated, more tech-savvy obama administration is that the Initial Implementing Guidance for the American Recovery and Reinvestment Act explicitly mentions feeds (page 54: preferred: Atom 1.0, acceptable: RSS) as a way in which information on spending has to be made available. it is certainly nice to see a government recognizing feeds as an important mechanism for information dissemination.
however, it seems that the feed is mainly intended as a trigger and not as the actual container of useful data. the actual data is apparently expected to be published in some format defined by a template at https://max.omb.gov/community/x/doC2Dw, but since this is a protected site, it is not even possible to find out what data format this template is using. in an ideal world, it would be XML, but something tells me that we are not yet living in an ideal world.
starting on page 55, the document becomes a bit more confusing. while there is a rather long list of data elements that certainly would be interesting to aggregate across agencies, there is no actual syntax defined, and the field types look a lot like SQL datatypes. so while it would be nice to get all this information in feeds, my guess is that we will get feeds that point to templates that contain the data, and these templates will probably use aggregation-unfriendly formats such as PDF or Excel. so in theory we get feeds, but in practice we probably only get feeds as a trigger and then have to deal with non-feed data.
what i am thinking about is creating a platform that aggregates all agencies' feeds, consumes the templates as new data becomes available, parses them, and republishes all information as well-designed feeds where all the data elements on pages 55-57 are encoded as Atom extension elements. this could even go a step further and for example geocode address information in the data elements and add GeoRSS to the republished feeds, allowing map-based applications to plot grant allocations on a map.
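to make the republishing step a bit more concrete, here is a minimal python sketch of what one republished entry could look like. everything specific in it is an assumption for illustration: the extension namespace `http://example.org/ns/recovery`, the field names, and the `build_entry` helper are hypothetical, since the guidance defines no syntax for its data elements. only the Atom and GeoRSS namespaces are the real ones, and the coordinates are assumed to come from some separate geocoding step.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
GEORSS_NS = "http://www.georss.org/georss"
# hypothetical namespace for the data elements; the guidance does not define one
ARRA_NS = "http://example.org/ns/recovery"

ET.register_namespace("", ATOM_NS)
ET.register_namespace("georss", GEORSS_NS)
ET.register_namespace("arra", ARRA_NS)


def build_entry(record, lat=None, lon=None):
    """Build one Atom entry whose data elements are Atom extension elements.

    `record` is a dict with "id", "title", "updated", and a "fields" dict
    holding the parsed data elements (names here are illustrative only).
    """
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = record["id"]
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = record["title"]
    ET.SubElement(entry, f"{{{ATOM_NS}}}updated").text = record["updated"]

    # each data element becomes one extension element in the assumed namespace
    for name, value in record["fields"].items():
        ET.SubElement(entry, f"{{{ARRA_NS}}}{name}").text = str(value)

    # GeoRSS Simple point: latitude first, then longitude, space-separated
    if lat is not None and lon is not None:
        ET.SubElement(entry, f"{{{GEORSS_NS}}}point").text = f"{lat} {lon}"
    return entry


if __name__ == "__main__":
    sample = {
        "id": "urn:example:award:1",
        "title": "sample award",
        "updated": "2009-02-26T00:00:00Z",
        "fields": {"award_amount": 1000, "recipient_city": "Berkeley"},
    }
    print(ET.tostring(build_entry(sample, lat=37.87, lon=-122.27), encoding="unicode"))
```

a complete entry would also need an author (at the entry or feed level) and a surrounding feed document; both are left out here to keep the focus on the extension elements.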
[[ february 26, 2009: to make this commentary a bit more constructive, here are some details of how to get the stimulus feed architecture right. ]]
This is a great idea on many levels. I will be watching w/ interest!
Posted by: Peter Keane | Tuesday, February 24, 2009 at 11:44
Ugh. RDBMS column specification?
Count me in for help with geocoding and aggregating.
Posted by: Sean Gillies | Tuesday, February 24, 2009 at 13:04
/me raises his hand to help out where possible as well.
I agree that linking to the more 'opaque' formats like Excel and PDF is not ideal - but at least it's heading in the right direction. It's been a long push to get to this point where the idea of "feeds" even shows up in requirements.
Having normalized GeoRSS Atom feeds would be *tremendous*. That's something we could pipe through GeoCommons and compare against other data that was also left strewn about in opaque formats (you know who you are shapefile).
Posted by: Andrew Turner | Tuesday, February 24, 2009 at 13:46
do you know if there is any European or other international authority doing it better? Would love to see how they do it.
Posted by: Kesava Mallela | Tuesday, February 24, 2009 at 16:53
also, is giving up control over data/feeds giving up control over their department? if so, I don't expect many depts to go full throttle even if Obama wants it so.
Posted by: Kesava Mallela | Tuesday, February 24, 2009 at 16:56
@Kesava, we can track those who do not comply with the feed requirement. It will be interesting to see if social pressure will drive participation... Cheers all, Nat
Posted by: Nat Wharton | Tuesday, February 24, 2009 at 17:17
@andrew: i agree that seeing feeds in requirements is encouraging. but then again, it worries me to see it so poorly designed. and it would not have been such a hard task to make it at least quite a bit better. and if all of this turns out to be not all too great because of the poor design, people will point at feeds and say: "see, those feeds are not all that great, i knew it all the time."
and then it won't matter that the problem was the way the feeds were used, because many people looking at this will believe it was the feeds themselves. maybe (actually, hopefully) i am overly paranoid here, but it happens quite a bit that people blame technologies for scenarios where these technologies were simply used poorly.
Posted by: dret | Wednesday, February 25, 2009 at 17:32