there has been a rather long history of people trying to describe REST applications, and an equally long history of responses saying that it cannot be done, should not be done, and that REST is all about following your nose across a web of interlinked resources. many approaches have used some kind of RDF-based model, and have tried to model/describe either an application in terms of SemWeb principles or, in some cases, the standards (such as HTTP) themselves. however, so far there seems to be no consensus on whether it could/should be done, and given that we expect RESTful APIs to become more popular, it would be good if the REST community could provide some guidance on how those APIs should be described/documented.
in an attempt to better understand what people are trying to accomplish when they try to describe REST, and why, here is a first approach to more cleanly layer the issues, see what's already around, and try to figure out what value could be provided by some format/technology for each layer. we are actively working in this area because we simply need something that will allow service providers to describe (and by that we mostly mean document) their services. our platform allows customers to define and expose REST services, and when they do this and want to point, for example, external partners at these services and ask them to develop clients, instructing them to tell their external partners to follow their nose just doesn't quite cut it.
in this first attempt at a layering model, there are five layers:
- architectural style: this is where REST itself is defined, independent of any technology. this is where the architectural principles of the style are defined, and fielding's thesis is by definition the one and only authoritative source in this layer.
- technology framework: on top of REST's principles, web architecture encompasses a set of enabling mechanisms, most importantly URIs for identification, HTTP for interaction, and media types for labeling representations (and it is interesting to notice that the core technologies on this layer are all defined by the IETF rather than the W3C). while this layer is essential for the fabric of the web, in itself it does not yet define a working system, because it lacks concrete representations that can be used as transfer formats.
- core technologies: for the human web, HTML is the most important media type (but without more media types such as images, the web would never have succeeded), and for machines XML and RDF have become important enabling technologies. it is important to notice that all of these media types are application-agnostic and not application-specific formats. clients need to support those media types to work properly, but without additional guidance (provided by human operators or additional information), they don't know what they are doing and blindly interact with a set of semantically shallow resources. core technologies allow clients to work in a RESTful way, but they may not allow servers to expose the richness of the services they are providing.
- added semantics: (almost) all of the web's core technologies have some mechanism to add semantics. sometimes these mechanisms have formalized representations (such as XML's DTDs or XSD), sometimes not (HTML profiles are not described in any standardized way). some added semantics are metamodels (XML schema languages allow authors to define their own schemas), some are fixed models (podcasts define a fixed set of extensions that can be found in feeds). most importantly, the core technologies provide ways to signal these added semantics to clients: for XML through namespaces and/or associated schemas, and for HTML through profile parameters (and the proposed 'profile' link relation type might be able to unify this mechanism across media types). added semantics allow servers and clients to support semantic overlays that enable richer and more semantics-driven forms of interaction.
- applications: based on core technologies and possibly added semantics, developers build applications, which expose the resources of a given service in some URI space. these applications can use any number of media types and any set of added semantics on top of these, and in some cases developers will define their own media types, whereas in others they will just reuse existing ones. in both cases, the application is seamlessly embedded into the bigger context of the RESTful system, and while that may not be interesting at all for some clients, it might well be interesting for others. for the HTML web, google's sitemaps format has become an established representation in this area because it allows servers and certain clients (crawlers) to interact in ways that are advantageous for both sides.
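
to make the layer 4 signaling a bit more concrete, here is a minimal python sketch of a client that looks at the XML namespaces used in a representation and checks which added semantics it recognizes. the registry (`KNOWN_SEMANTICS`), the function names, and the sample feed are assumptions made up for illustration; only the atom and podcast namespace URIs are real identifiers.

```python
import xml.etree.ElementTree as ET

# hypothetical registry: namespace URI -> added semantics this client understands
KNOWN_SEMANTICS = {
    "http://www.w3.org/2005/Atom": "atom feed",
    "http://www.itunes.com/dtds/podcast-1.0.dtd": "podcast extensions",
}

def namespaces_used(xml_text):
    """Return the set of namespace URIs used by element names in the document."""
    found = set()
    for element in ET.fromstring(xml_text).iter():
        tag = element.tag
        # ElementTree writes qualified names as "{namespace-uri}local-name"
        if isinstance(tag, str) and tag.startswith("{"):
            found.add(tag[1:].split("}", 1)[0])
    return found

def recognized_semantics(xml_text):
    """Return the labels of all added semantics the client knows, sorted by label."""
    used = namespaces_used(xml_text)
    return sorted(KNOWN_SEMANTICS[ns] for ns in used if ns in KNOWN_SEMANTICS)

# illustrative feed: atom plus podcast extensions plus one unknown vocabulary
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
      xmlns:x="http://example.com/unknown">
  <title>example</title>
  <itunes:author>someone</itunes:author>
  <x:thing/>
</feed>"""

if __name__ == "__main__":
    print(recognized_semantics(SAMPLE_FEED))
```

a client written this way keeps working on the core media type and simply skips vocabularies it does not know (here the `http://example.com/unknown` namespace), which is the overlay behavior described for this layer.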
based on some of the discussions that happened around
describing REST, it seems that at times, there was a mismatch in where in these layers people wanted to be. when it comes to layers 3 and 4, documentation/description should be the media type or semantic extension itself. for these cases, all that a client should need to know is the processing and operational model of the resource, and then it can act. however, when it comes to layer 5, i am a firm believer that there are scenarios where a description format would be very desirable, and the sitemaps format is one (very widely deployed) example.
in our work on the Resource Linking Language (ReLL) we have explored this area and made some design choices that i would now make differently. but generally speaking, giving application providers the possibility to describe the extent of their application, and doing that in a way which itself is a self-describing resource, seems like a very RESTful approach. if a client is interested in this information, there is a discovery mechanism (for sitemaps a slightly awkward robots.txt-based rule that hardcodes a URI-based approach) and then the client can happily use that information, and could for example inform users that they are now leaving the application (which may or may not be indicated by a change in some URI prefix). if clients are not interested in the notion of an application at all (like regular users happily surfing the web without ever experiencing application boundaries), they simply ignore this information.
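
as a rough illustration of that flow, the following python sketch pulls the `Sitemap:` entries out of a robots.txt body and then checks whether following a link would take a client out of the application. the prefix-based boundary test, the function names, and all example.com URIs are assumptions for this sketch; as noted above, a real application boundary may or may not line up with a URI prefix.

```python
def sitemap_uris(robots_txt):
    """Collect the URIs announced by 'Sitemap:' lines in a robots.txt body."""
    uris = []
    for raw_line in robots_txt.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap" and value.strip():
            uris.append(value.strip())
    return uris

def inside_application(uri, prefixes):
    """Assumption: the application's extent is given as a list of URI prefixes."""
    return any(uri.startswith(prefix) for prefix in prefixes)

def leaves_application(current_uri, next_uri, prefixes):
    """True when the client is inside the application now and the next URI is outside."""
    return inside_application(current_uri, prefixes) and not inside_application(next_uri, prefixes)

# illustrative data only
ROBOTS_TXT = """User-agent: *
Disallow: /private/
Sitemap: http://example.com/orders/sitemap.xml
"""
APP_PREFIXES = ["http://example.com/orders/"]  # trailing slash avoids matching /orders-old/

if __name__ == "__main__":
    print(sitemap_uris(ROBOTS_TXT))
    print(leaves_application("http://example.com/orders/42",
                             "http://example.com/blog/", APP_PREFIXES))
```

a client that does not care about application boundaries never calls any of this and just follows links, which is the "simply ignore this information" case.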
starting from this idea of REST layers, i'd be curious to get some feedback about the perceived usefulness of a description language that resides specifically in layer 5. the main goal is to establish a format that could be seen as sitemaps for REST, and that definitely should be less chatty than exhaustively listing all URIs. this would allow service providers to describe the universe of resources they are exposing, and would give clients a snapshot view of the current expectations they can have about exposed resources, and of the assumptions about where the scope of the application ends. in a follow-up to this post, i will outline our scenarios, and write more about the things we would like to include and exclude in such a REST Application Description Language.