Friday, May 16, 2008

XO & XP

Microsoft will be supporting the XO laptop, also known as the One Laptop Per Child (OLPC) laptop. so far, the machine was only available with linux and a user interface which, well, is not that easy to use, at least not for anybody used to windows-based interfaces (referring to windows as the main user interface abstraction, not to Windows the operating system).

there will be a lot of ideological debate about whether this is somehow compromising the idea and ideals of the OLPC project. Microsoft will get $3 per laptop, which is certainly an acceptable price. on the other hand, they will of course get the opportunity to further hardcode Windows into everybody's brain as the mental model of how a computer works. not bad as a long-term strategy for ensuring that Windows will remain the de-facto standard, even in new generations of computer users.

i am definitely looking forward to installing XP on our XO laptop; the operating system so far has driven me nuts. it may be designed according to the latest breakthroughs in activity-based user interface design, but it simply is not convenient to use for anybody used to windows-based user interfaces.

i am still interested in looking into how the XO could be used as an e-book reader, and what i am most interested in is web-based e-books. so an XO running XP and firefox 3 would be a really nice experimentation platform. on the other hand i am wondering how well the not-so-powerful XO will perform when running Windows. there definitely is a reason why they talk about putting XP on it and not vista, i guess...

Wednesday, May 14, 2008

LocWeb 2008 Proceedings

after spending considerable amounts of time and energy to not only organize the LocWeb 2008 workshop at WWW2008, but to also get it published in the ACM Digital Library, i am happy to report that the LocWeb 2008 proceedings of the First International Workshop on Location and the Web are now available. the most convenient ways of accessing them are through the ACM DL site or through DBLP.

Tuesday, May 13, 2008

Integration with REST

i am currently working on a paper about loose coupling (basically trying to figure out what it really means, because everybody always does it, but nobody says what they mean by that), and this of course happens in the context of the Grand Web Services Debate.

i think one of the major problems in this debate is that the WS-* people and the REST people often start the debate by asking "how do we achieve IT integration?", and i think this is the wrong question to ask, which of course is not a very good starting point for having a debate...

this is the reason why REST is (and always should be) referred to as an architectural style (and not just an implementation-level alternative to WS-*). the question is, if you build a massive-scale distributed heterogeneous information system, is it smart to use an approach looking for integration in the traditional middleware sense of the word?

i guess the big problem here is that typical enterprise IT people are not yet ready to let go of the traditional integration and middleware approaches, which basically tell you that if you try hard enough and spend enough money, you can treat your IT landscape as a homogeneous system, and that it is a smart move to try to get to that point. i think this approach is already demonstrating its inability to adjust to the scale and frequency of changes in today's IT world, but as with many old habits, integration will be on the list of strategic IT goals for organizations for a long time to come.

it will take some more time until the majority of decision makers realize that REST is not something you use to achieve integration. it is something you use when you have learned that integration is the wrong goal to have.

Wednesday, April 30, 2008

HTML in Atom

one of the joys of Atom is that processing text in Atom is not quite as easy as it seems. the reason is that Atom supports three types of content for text constructs: text, html, and xhtml. this means that if you want to upconvert Atom feeds to pure XHTML (i.e., turn escaped HTML into proper XHTML), you have to deal with the possibility that HTML content not only may be non-XML, it may actually be broken HTML that has been produced manually or by broken tools.
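to make the three cases concrete, here is a minimal python sketch of how a consumer might tell them apart. the sample entry and the function name `classify_text_construct` are made up for illustration; the only things taken from the Atom format are the `type` attribute, its default of text, and the rule that xhtml content is wrapped in a single XHTML div.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
XHTML = "{http://www.w3.org/1999/xhtml}"

def classify_text_construct(element):
    """Return (kind, payload) for an Atom text construct (hypothetical helper)."""
    kind = element.get("type", "text")  # a missing type attribute means "text"
    if kind == "xhtml":
        # xhtml content is a single XHTML div child, already parsed as XML
        return kind, element.find(XHTML + "div")
    # for "text" and "html" the XML parser has already undone the escaping,
    # so the payload is a plain string (which for "html" may be broken markup)
    return kind, element.text or ""

# made-up sample entry with one construct of each type
ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <title>plain title</title>
  <summary type="html">&lt;p&gt;unclosed &lt;b&gt;bold</summary>
  <rights type="xhtml"><div xmlns="http://www.w3.org/1999/xhtml"><p>clean</p></div></rights>
</entry>"""

entry = ET.fromstring(ENTRY)
results = {child.tag[len(ATOM):]: classify_text_construct(child) for child in entry}
```

only the html case hands you a string that still needs (fault-tolerant) parsing; the text case needs escaping at most, and the xhtml case is already a tree.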

browsers have to deal with broken HTML all the time; studies of real-world HTML pages on the web show that the overwhelming majority of HTML pages on the web are broken. my guess is that this ratio will be much better for Atom, but there probably still are quite a number of feeds which contain HTML snippets generated by hand or by broken tools.

what this comes down to is that a good Atom implementation should process HTML content in a fault-tolerant way. browsers implement their own proprietary parsing, which not only leads to various interpretations of the same (broken) HTML, but also makes it hard to decide on the right way to fix broken HTML.
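as a rough illustration of what "fault-tolerant" could mean here, the following python sketch repairs an HTML snippet into well-formed XHTML using only the standard library's `html.parser`. this is a naive repair strategy of my own (close whatever is still open, drop stray end tags), not the HTML 5 parsing algorithm and not what any browser does; the names `XhtmlRepairer`, `repair`, and the short `VOID` list are assumptions made for the example, and it does not guard against invalid or duplicate attribute names.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

# assumed (incomplete) list of elements that never have content
VOID = {"br", "hr", "img", "input", "link", "meta"}

class XhtmlRepairer(HTMLParser):
    """Naive repair: track open elements and force them to nest properly."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        # valueless attributes (e.g. "checked") get their own name as value
        rendered = "".join(
            " %s=%s" % (name, quoteattr(value if value is not None else name))
            for name, value in attrs
        )
        if tag in VOID:
            self.out.append("<%s%s/>" % (tag, rendered))
        else:
            self.out.append("<%s%s>" % (tag, rendered))
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray end tag: drop it
        # close everything opened since the matching start tag
        while self.open_tags:
            current = self.open_tags.pop()
            self.out.append("</%s>" % current)
            if current == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data))  # re-escape &, < and > for XML

def repair(html_snippet):
    parser = XhtmlRepairer()
    parser.feed(html_snippet)
    parser.close()
    while parser.open_tags:  # close whatever the author forgot
        parser.out.append("</%s>" % parser.open_tags.pop())
    return '<div xmlns="http://www.w3.org/1999/xhtml">%s</div>' % "".join(parser.out)

# the repaired snippet can now be handed to an XML parser
tree = ET.fromstring(repair("<p>unclosed <b>bold"))
```

the point of the sketch is only that the output is always well-formed; which of the many possible repairs is the "right" one is exactly the question that a shared parsing model would have to answer.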

luckily, HTML 5 introduces its own parsing model. it starts with parsing a sequence of bytes, and has a second phase which works on unicode characters. if you are operating in an XML environment, however, you already have proper unicode to work with, so Atom can skip the byte parsing process.

my initial idea was to try to implement that algorithm in XSLT, because it would be the ideal candidate for turning HTML in Atom into XHTML, so that such a cleanup process could be based entirely on XML tools. however, so far the specification of the parsing process looks pretty much impenetrable to me; it looks mostly like spaghetti code that has been translated into english (actually, the writing style of that part of the HTML spec reminds me a lot of the XML Schema spec, which also does not really excel at clarity).

i have some general doubts about using XSLT for a job like that, because parsing like that probably does not really work too well with XSLT's language design. but at least i would be interested to see whether it's possible. or more generally, has anybody implemented the HTML 5 parsing algorithm? in any language? since it is presented in a rather unstructured way, it is hard to validate by eye. is there any validation that it works in principle? maybe a state machine or something along these lines? and is there some assurance that the text truthfully and completely describes the algorithm?

don't get me wrong, i really think it would be great to have a well-defined way of parsing HTML; that would be the right step to make browser behavior more predictable. but the current spec may need some improvement. or maybe i just have to spend more time reading it? or maybe somebody else is interested in implementing it in XSLT?

Tuesday, April 29, 2008

FOAF me!

after visiting the WWW2008 conference in beijing last week, i am once again curious to find out more about all the stuff that is going on on the web, both in terms of technologies and in terms of pragmatic and social practices. i am still unconvinced that all the fuss around the semantic web is justified, but i still want to believe.

so after not updating my FOAF description for a couple of years (1536 days, to be exact), i have now generated a new version, using the vCard export from LinkedIn, and then feeding that into FOAFgen. the resulting FOAF in all its RDF/XML glory has 558 triples, the W3C RDF validator reports.

now i am waiting for some FOAF crawler to come by and discover and read my FOAF and then something will happen, i hope. i don't know what, but i hope it will be a good thing!
