at xtech 2007, felix michel will present our paper Data Model Perspectives for XML Schema, which looks at the question of how grammars for XML document classes can be presented in different ways. the interesting observation is that the good old Document Type Definition (DTD) way of writing down grammars is not the only possible way, and in fact for some classes of applications it may not be the most appropriate way.
the paper is the result of felix's master's thesis on the Representation of XML Schema Components, which produced not only a very neat XML Schema documentation tool (X2Doc, more on that later), but also a way to represent and access the abstract structures of an XML Schema, something we dubbed Schema Component XML Syntax (SCX). basically, SCX is a way to represent XML Schema components in XML. this opens the door to more interesting things, for example our XML Schema Path Language (SPath), which extends XPath 2.0 so that XML Schemas become accessible through (X|S)Path expressions as well. there are two main ideas behind this approach:
- make schemas more accessible to developers in general. XML Schema is widely regarded as being almost impenetrable, conceptually as well as technically. there is not much that can be done to make the specification simpler without changing it significantly, but at least on the technical side there could be better support for actually working with schemas. currently there is next to none (the W3C is hard-wiring XML Schema into many specs, but so far there is no API for it, nor any other way for people to access it programmatically).
- by having tools for accessing schemas programmatically, our final goal is to bridge the current gap between the very data-oriented XML world, which often simply treats data in a pragmatic and not very robust way, and the data modeling world, which sees XML as just some serialization. there are too few connections between these two worlds.
while SCX is pretty stable, SPath is more of an idea and a rough underlying function library than a complete language. at some point, though, we hope that people will stop just going fishing in tag soup, and instead will demand adequate tools for finding what they want to find: by using schema information to better process XML.
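to give a rough feeling for what "accessing schemas programmatically" can mean, here is a minimal sketch in python. it is not SCX and not SPath: it simply runs the limited XPath subset of python's standard `xml.etree.ElementTree` module against the plain XSD syntax of a tiny made-up schema. the schema, the `child_elements` helper, and the `book` example are all assumptions invented for illustration, and the helper only handles the trivial case of a named complex type with a top-level sequence.

```python
import xml.etree.ElementTree as ET

# Namespace of the XML Schema language itself.
NS = {"xs": "http://www.w3.org/2001/XMLSchema"}

# A tiny, made-up schema used only for this illustration.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="book" type="BookType"/>
  <xs:complexType name="BookType">
    <xs:sequence>
      <xs:element name="title" type="xs:string"/>
      <xs:element name="year" type="xs:gYear" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
"""


def child_elements(schema_root, element_name):
    """Return (name, type) pairs for the children of a global element.

    Hypothetical helper: it only follows a reference to a named, global
    complex type and only looks at that type's top-level xs:sequence.
    """
    decl = schema_root.find(f"xs:element[@name='{element_name}']", NS)
    if decl is None:
        return []
    ctype = schema_root.find(f"xs:complexType[@name='{decl.get('type')}']", NS)
    if ctype is None:
        return []
    return [
        (e.get("name"), e.get("type"))
        for e in ctype.findall("xs:sequence/xs:element", NS)
    ]


root = ET.fromstring(SCHEMA)
print(child_elements(root, "book"))
```

even this toy version shows the problem: the code has to know how the schema happens to be written down (named type versus anonymous type, sequence versus choice, includes and imports, and so on). working on the schema components instead of on the syntax is meant to remove exactly that kind of guesswork.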
at xtech 2007, jeni tennison will present some related work (even though hers does not address XML, but instead the question of how to create schemas for languages allowing overlapping markup), also talking about fun stuff such as brzozowski derivatives. i am curious to see whether any of this will finally become mainstream. certainly not all of it will, but the area of XML processing has a bit of growing up to do to be a better fit for the world of loosely coupled systems.