one of the main principles of REST is what is unfortunately most often called Hypermedia as the Engine of Application State (HATEOAS), which must be the worst acronym ever invented. regardless of that, this is one of the central REST constraints, and also one that is often ignored, with services using no discoverable links at all, or using URI templates and assuming that this already is RESTful. it can be, but it often isn't, because even if there were a stable standard for URI templates (and there isn't), it would still be impossible to know how to substitute the template variables to get URIs of available resources. this is why RESTful services should include links in resources, so that applications can simply follow those links.
in web architecture, REST's links are found by knowing a resource's media type, and knowing how to find the links in that format. the most popular linked formats on the web probably are (X)HTML and Atom, both of which define links and link semantics in their data format. however, since XML is such a popular syntax for many different content types, i have been toying with the idea of defining a generic XML mechanism for discovering links in XML (the only generic descriptions for XML available today are the built-in xml:lang, and the separate xml:id specification). notice that this is different from datatype mechanisms such as XSD, which are able to type content as a URI, but leave it unclear whether this URI is a link, and if so, what its semantics are (which i will call role from now on). as a starting point, here is a brief list of requirements:
- link discovery should work inline (information embedded in XML instances) and out-of-line (as an external description of how to find links in some XML).
- link discovery can be a perspective of a media type, i.e. different applications might have different ideas of how to find links in a given media type, or a subset of documents of a given media type.
- links can have roles, and are allowed to have more than one role. role names can be strings, qualified names, or URIs.
- the mechanism should rely on the smallest possible set of technologies, so that it can be easily implemented using widely deployed tools and technologies.
- there should be a simple syntax for fully describing the link discovery information, so that it can be serialized and exchanged.
starting from this simple and very likely incomplete set of requirements, here is my first stab at a mechanism for Link Discovery in XML (LDX). the main idea is that links are identified by XPath-based selectors (XSLT's pattern subset might be a good match here), that selected links are associated with roles, and that this mechanism can be applied to any XML-based media type. what would be the advantage of such a mechanism? a simple way to describe the links and link semantics of XML media types, the ability to build generic tools that can extract links from any XML-based media type, and the ability to better separate general REST functionality (what links can i follow in this resource?) from content-specific functionality (what does the content mean, and what does it mean to follow the links?). generally, such a mechanism could become part of a REST toolbox, and there are advantages for both service providers and consumers.
now on to the specifics. it's a bit harder to come up with a good inline format (HLink still might be remembered by some), so my first pre-alpha idea focuses on the out-of-line format. it's a simple set of selectors associated with 0-n roles. each selector selects an attribute or element content (interpreted as the string value), and this content is then recognized as a link with the associated roles. selectors probably should be able to be grouped and contextualized, so that for example some minimal LDX for XHTML might look like this:
<link match="html:link[@rel='stylesheet']/@href" role="stylesheet"/>
<link match=".//html:img/@src" role="image"/>
this would make it possible to automatically extract all links to stylesheets and images from XHTML documents. the role names in this case are just strings, and there definitely has to be some better handling of non-string role names; in particular, differentiating qualified names and URIs (which could, at least in theory, use CURIE syntax) might become a bit of a challenge.
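to make the extraction step a bit more concrete, here is a minimal sketch of what a generic tool applying these two rules could look like, using only python's standard library. this is an illustration and not a proposed LDX implementation: the names `RULES`, `NS`, and `discover_links` are made up for this example, and because ElementTree's limited XPath subset cannot select attributes, each rule is split into an element path and an attribute name (an assumption of this sketch, not part of the LDX idea). the first selector is also rewritten with a leading `.//` to approximate pattern-style matching anywhere in the document.

```python
import xml.etree.ElementTree as ET

# prefix binding used by the selectors below
NS = {"html": "http://www.w3.org/1999/xhtml"}

# hypothetical in-memory form of the two LDX rules shown above:
# (element path, attribute holding the link, roles)
RULES = [
    (".//html:link[@rel='stylesheet']", "href", ["stylesheet"]),
    (".//html:img", "src", ["image"]),
]


def discover_links(xml_text, rules, namespaces):
    """Return (uri, roles) pairs for every attribute selected by a rule."""
    root = ET.fromstring(xml_text)
    links = []
    for path, attribute, roles in rules:
        for element in root.findall(path, namespaces):
            value = element.get(attribute)
            if value is not None:
                links.append((value, list(roles)))
    return links


DOC = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <link rel="stylesheet" href="style.css"/>
    <link rel="alternate" href="feed.atom"/>
  </head>
  <body><p><img src="logo.png" alt="logo"/></p></body>
</html>"""

for uri, roles in discover_links(DOC, RULES, NS):
    print(uri, roles)
```

the `rel="alternate"` link is not reported, because no rule selects it. a tool with full XPath support could use the selectors verbatim instead of the path/attribute split used here.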
namespaces (in the described media types) are a headache as usual, the eternal question being what to do with unprefixed names in selectors. use the LDX document's default namespace (à la XSD), or have a special attribute for setting the default namespace (à la XSLT)? i prefer the latter, but that's more syntax juggling than anything else.
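the XSLT-like option can be sketched with python's standard library, which (assuming python 3.8 or later) treats an empty-string prefix in the namespace map as the default namespace for unprefixed names in a path. the `select` helper and its `default_namespace` parameter are hypothetical stand-ins for whatever attribute LDX might end up using.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"


def select(xml_text, path, default_namespace=None):
    """Evaluate a path; unprefixed names use default_namespace if given."""
    # assumption: python 3.8+, where the '' key sets the default namespace
    namespaces = {"": default_namespace} if default_namespace else None
    return ET.fromstring(xml_text).findall(path, namespaces)


DOC = ('<html xmlns="http://www.w3.org/1999/xhtml">'
       '<body><img src="a.png"/></body></html>')

# without a default namespace, the unprefixed name matches nothing
without_default = select(DOC, ".//img")
# with the default namespace set, the same selector finds the image
with_default = select(DOC, ".//img", XHTML)
```

the same selector text behaves differently depending on the setting, which is exactly why the choice needs to be explicit in the LDX document.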
the fun part is that because of the selectors, applications can even try to be smart and create LDX that works for the specific content they're concerned with. let's imagine a RESTful application using XHTML which contains links, and always contains a link to a product in the first column of each row of a specific table. using a selector such as table[@id='products']/tr/td/a/@href, it would be possible to associate these links with a specific role, and with LDX support in a REST toolbox, finding all those links would be trivial. and the LDX description of the assumptions made about the content would be concise and declarative, so that it would be easy to understand the linking assumptions of the application.
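as a rough sketch of how such an application-specific rule might be applied, the following uses python's standard library on a made-up page. the function name `links_with_role`, the role name `product`, and the sample markup are all assumptions for illustration. the selector from the text is written with explicit `html:` prefixes, and the attribute is passed separately because ElementTree's XPath subset cannot select attributes directly.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"


def links_with_role(xml_text, element_path, attribute, role):
    """Return (uri, role) pairs for the attribute on all matched elements."""
    root = ET.fromstring(xml_text)
    found = root.findall(element_path, {"html": XHTML})
    return [(e.get(attribute), role) for e in found
            if e.get(attribute) is not None]


# made-up page: one unrelated link and a product table
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml"><body>
  <p><a href="/about">about</a></p>
  <table id="products">
    <tr><td><a href="/products/1">widget</a></td><td>9.99</td></tr>
    <tr><td><a href="/products/2">gadget</a></td><td>19.99</td></tr>
  </table>
</body></html>"""

# prefixed equivalent of table[@id='products']/tr/td/a/@href
PRODUCT_PATH = ".//html:table[@id='products']/html:tr/html:td/html:a"
product_links = links_with_role(PAGE, PRODUCT_PATH, "href", "product")
```

the link to `/about` is left out because it sits outside the selected table, so the rule captures only what the application assumes about its own content.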
most of LDX currently exists in my head only, and right now i am mostly wondering what others think of that idea. does this make sense from a RESTful point of view? would it be a worthwhile addition to a REST toolbox? maybe even to the REST landscape in general? and, assuming it does make some sense and i am tempted to start writing down a bit more of my ideas, should i start with a W3C or with an IETF template?