one of the interesting and essential things about REST is that while identity and interaction are based on the same concept (the URI), they are not the same (this is what hypermedia is all about). you can have identity and no interaction, for example when saying that something has been standardized by the standard urn:ietf:rfc:6570. or you can have interaction and no (stable) identity, for example when referring to http://amazon.com/bestsellingbookoftoday. REST tells us that we have to be careful and be clear about when URIs are identifiers, and when they are links, so that clients can use URIs in the right way. it's the media type's job to clearly say what are identifiers, and what are links.
for links, it is pretty clear that they have to be actionable, otherwise they make little sense as links. thus a URI scheme with a well-defined uniform interface, such as HTTP, makes a lot of sense. for identifiers, though, there are more options to choose from, because identifiers don't have to be actionable, and thus can use non-actionable URI schemes such as URNs (or you can fake URNs by using tag URIs and some local conventions). using non-actionable identifiers allows you to design them with more robustness against changes, but it removes the convenience of using them as immediately usable links. this consideration becomes particularly important in cloud scenarios, where services are made available in ways that should make it easier to reconfigure services and service systems.
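as a minimal sketch of the distinction, a client could decide by URI scheme whether it may treat a URI as a link or only as an identifier. the set of "actionable" schemes and the function name are assumptions made for this illustration; in a real system the media type would say which is which.

```python
from urllib.parse import urlsplit

# assumption for this sketch: only HTTP(S) URIs are treated as actionable links
ACTIONABLE_SCHEMES = {"http", "https"}

def is_actionable(uri: str) -> bool:
    """return True if the URI uses a scheme we know how to interact with."""
    return urlsplit(uri).scheme.lower() in ACTIONABLE_SCHEMES

for uri in ("http://amazon.com/bestsellingbookoftoday", "urn:ietf:rfc:6570"):
    kind = "link (can be followed)" if is_actionable(uri) else "identifier only"
    print(uri, "->", kind)
```

the URN still identifies something perfectly well; the client just has no uniform interface for interacting with it.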
looking at the value that many organizations see in the cloud, it often is not (only) the ability to outsource things easily, but almost more importantly the ability to change these decisions easily. agility seems to be what most CIOs really like about the cloud: if it's cheaper to run something hosted somewhere else, then we should be able to do it. but if regulation forces us to run some services in in-house data centers, then we should be able to do that as well. if switching between these configurations requires us to change anything in our service system, then something in the cloud setup is not delivering as promised.
if you're designing services for cloudy environments, this agility makes your job a little trickier. for example, while it's easy to deploy something with cloudfoundry, the standard model means that you'll end up with yourservice.cloudfoundry.com, which means that if you naively mint URIs based on this DNS name, the identity of your objects will forever be bound to cloudfoundry. how easy it is to avoid this differs between cloud layers.
- IaaS offerings often make it very easy to avoid having to deal with cloud identity at the identifier level. the cloud shows up at the IP level, and deployments use whatever DNS name you'd like to use. this is safe, but on the other hand, IaaS services are not the ones offering the greatest agility, since you still have to do most of the traditional system administration tasks yourself.
- PaaS offerings (such as cloudfoundry) provide greater agility, and you can even install your own cloudfoundry PaaS platform wherever you like. however, you might not always have full control over the DNS level of things (cloudfoundry's hosted service only provides you with a subdomain), so identity may be bound to your PaaS provider, and moving away from that will break all identifiers.
- SaaS offerings are probably even trickier than that because they might not just dictate the DNS name, but also impose certain URI patterns, because that's how a specific SaaS service works. minting those URIs might couple you even more tightly to a cloud.
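one way to catch this kind of coupling early is to check minted URIs against a list of provider-owned domains. the following is only a sketch: the domain list and the function name are assumptions for illustration, and it only looks at the DNS level, not at provider-imposed URI patterns.

```python
from urllib.parse import urlsplit

# assumption for this sketch: domains owned by cloud providers you deploy to
PROVIDER_DOMAINS = {"cloudfoundry.com"}

def is_provider_bound(uri: str, provider_domains=PROVIDER_DOMAINS) -> bool:
    """return True if the URI's host is a provider domain or a subdomain of one."""
    host = (urlsplit(uri).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in provider_domains)

print(is_provider_bound("http://yourservice.cloudfoundry.com/book/42"))
print(is_provider_bound("http://books.example.org/book/42"))
```

running a check like this over the identifiers a service hands out, before any real data goes in, would flag the first URI and accept the second.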
where does this leave users trying to use cloud offerings, but maybe being nervous about identity issues? while URNs may seem like the easy way out, providing a clean separation between identity and identity resolution, it is really hard to design and build systems that use URNs. while you might not need the full complexity of it, you're essentially re-building the DNS. probably on a smaller scale, but with many of the same challenges (and resulting weaknesses, if you're not very careful). URNs make things more complex for clients and for servers, so if possible, they should be avoided.
this leaves us with using URIs. what's essential is that in any RESTful system serving critical business needs, URIs must be persistent. this rules out any URIs where cloud components show up in the URI. while you might rely on a cloud provider for running your services, you should always have a strategy for changing providers, if you have to. if that strategy means breaking all identifiers you have minted previously, it's not a good strategy. also, HTTP redirects do not really help in this case, because then you would need to run a redirect service at the old DNS name, which means you would still critically depend on the old service provider.
what this means is that you must have a model of how to set up the cloud service to serve the URIs you want it to serve, under your authority (i.e., bound to a DNS name that is independent of the cloud provider). as pointed out above, for IaaS this is probably easy to do. for PaaS, it may be harder. for SaaS, it's probably very hard. but it's your business data, and everything in REST revolves around stable and reliable identities, so this is something to look at very carefully.
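on the minting side, the idea can be sketched as keeping the authority part of every URI a configuration value that you own, never the host name the provider happens to assign. the DNS name books.example.org and both function names are hypothetical and only serve this illustration.

```python
from urllib.parse import urlsplit, urlunsplit

# hypothetical DNS name under your own authority; not taken from any real deployment
PUBLIC_AUTHORITY = "books.example.org"

def mint_uri(path: str, authority: str = PUBLIC_AUTHORITY) -> str:
    """mint an identifier under a DNS name you control."""
    if not path.startswith("/"):
        path = "/" + path
    return urlunsplit(("http", authority, path, "", ""))

def rebase(uri: str, authority: str = PUBLIC_AUTHORITY) -> str:
    """rewrite a provider-bound URI so that it uses your own authority."""
    parts = urlsplit(uri)
    return urlunsplit((parts.scheme, authority, parts.path, parts.query, parts.fragment))

print(mint_uri("/book/42"))
print(rebase("http://yourservice.cloudfoundry.com/book/42"))
```

rebasing like this only helps before identifiers have been handed out. once other services hold the provider-bound URI, rewriting it on your side does not change what they have stored. the code also does nothing about the harder part, which is getting the cloud service to actually answer requests for that DNS name.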
this also means that even though it's nice to be able to start very simply and easily in some cloud environments, think very carefully about identity issues before putting in the first real data. as soon as any other service depends on this simple proof-of-concept deployment, you will start breaking things somewhere when you redeploy.
to summarize, it seems thinking about identity is worth the effort before jumping into any specific cloud setup. for any persistent resources, make sure identity is not bound to your cloud provider in any way, and if the data is critical business data, invest some effort in exploring migration options, and maybe even run through a migration test case.
all of this naively assumes your resources always migrate as a whole. things get even more interesting when you think that for business reasons (think regulatory requirements) you have to migrate a subset of resources, currently hosted in some cloud, to a different cloud. this is more interesting because now, just changing DNS entries doesn't do the job anymore. this will be the subject of a follow-up to this post, but for now, feel free to share your experience, and where you found certain offerings and setups to be particularly supportive, or problematic.
I'm confused when you say that http://amazon.com/bestsellingbookoftoday is not a (stable) identity. After all, that URL is an identifier, and it identifies a resource. And that resource is thereby the identity.
Since resources are temporally varying membership functions, I'd say that yes, that URL does identify a stable identity. The fact that the *identity* can correspond to different *entities* at different points in time does not matter; what matters is that it will at any point in time be “the best selling book of today”.
Posted by: Ruben Verborgh | Wednesday, November 28, 2012 at 03:26
@RubenVerborgh maybe i picked a bad example, and everything you write is correct. the point is that the persistent URIs that may be of more business value when building services around books are probably http://amazon.com/book/42 resources, because there will be important services linking to those when it comes to ordering, relating, managing comments and so forth. what i wanted to show was that some identities that an app may expose are not as core to a business as others, so if they go away or have to be moved, the impact on the business is probably less severe. for other resources, though, they are critical for the business and if they change identity, pretty much every business activity is affected.
Posted by: dret | Wednesday, November 28, 2012 at 09:06
That makes a lot of sense, thanks for clarifying!
Posted by: Ruben Verborgh | Wednesday, November 28, 2012 at 09:08