this morning i was contacted by the Linked Data™ Police. i found the following email in my inbox, from a pretty well-established researcher at a pretty well-established institution:
I came across this: http://repositories.cdlib.org/ischool/2009-035/
So you consciously redefine term "linked data" to mean something different. Look, the big point of linked data is that it means you are interoperable with a large number of existing data sources available in the same data model and protocol, like Freebase, DBpedia, Geonames, and so forth. I had to explain to colleagues that no, there is nobody working on providing linked data for recovery.gov, and the guys in Berkeley who claim to be working on it are actually just abusing the term as a buzzword in order to get free attention, but actually have no intention of delivering anything that's interoperable with the linked data standards. There are perfectly descriptive terms for the technology you are using, REST and ROA.
You do a disservice to yourself. What you do is intellectually dishonest and unnecessarily antagonizes those who have worked hard to establish a set of interoperable practices around RDF on the web.
Please stop this practice.
so instead of just discussing the question of whether Linked Data requires RDF, you now get a cease and desist letter when your work is about publishing data that is linked and you dare to call it Linked Data. i think the letter mostly speaks for itself, but what really scares me is the upside-down attitude expressed by it. instead of looking at the web as something that is good and helpful and should be improved by adding more semantics, and then trying to figure out the most effective way of doing this, given the constraints of the scenario considered, it starts with a set of technologies you have and want to use, and then claims that whatever you want to do, you have to do it using those technologies.
generally, the approach of using generic problem names to refer to specific technologies is something that only helps to confuse a lot of people, and mostly is an attempt to position these technologies in a way which makes competition harder. my favorite examples of bad labels for specific technologies are Linked Data, Semantic Web, Web Services, and XML Schema. if you build technologies, stick to calling them technologies, don't start calling them solutions. a solution is a technology that is applied to a given problem within the limits given by the problem scenario's constraints, and no technology can predict the problem.
oh, and, just for the record: if you look at the approach we're taking, beyond just checking whether we propose to use RDF directly, RDF is just a stylesheet away. publish your data in a RESTful way that's accessible to all feed users and readers out there, and then writing some GRDDL that will fill a triple store becomes trivial. which kind of demonstrates the point that RDF is just an implementation issue, it should not be the litmus test of whether you follow the True Path Of Linked Data. and the report quoted above actually explicitly explains exactly what we do, and how it compares to the narrower definition of Linked Data:
Recently, the idea of openly accessible data has been promoted under the term of linked data, with recent recommendations being centered around a very specific choice of technologies and data models (all centered around Semantic Web approaches focusing on RDF for data representation and centralized data storage). While it is possible to use these approaches for building Web applications, our recommendation is to use better established and more widely supported technologies, thereby lowering the barrier-to-entry and choosing a simpler toolset for achieving the same goals as with the more sophisticated technologies envisioned for the Semantic Web. For the remainder of this report, we use the term linked data as the general concept of publishing interlinked data representations, without referring to the one specific way of implementing it that is often associated with that term as well.
i think this is about as clear as you can get. and i think a pretty large set of substantial problems have yet to be tackled in the RDF approach to linked data; identity, provenance, dynamic services, granularity, and updates come to mind. so rather than arguing about who is entitled to call linked data Linked Data™, it might be more interesting and enlightening to compare different approaches as solutions for specific problems, and see how well they work.
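to make the "RDF is just a stylesheet away" remark a bit more concrete, here is a minimal sketch of the idea. it is not GRDDL or XSLT, just a Python stand-in for what such a stylesheet would do: walk an Atom feed and emit N-Triples lines that a triple store could load. the sample feed, the example.org URIs, the function names, and the choice of Dublin Core terms as the target vocabulary are all assumptions made for illustration; none of them come from the report.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"   # Atom namespace in ElementTree notation
DCT = "http://purl.org/dc/terms/"        # assumed target vocabulary (Dublin Core terms)

# hypothetical sample feed, made up for this sketch
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>example awards</title>
  <entry>
    <id>http://example.org/awards/1</id>
    <title>award one</title>
    <updated>2009-11-20T00:00:00Z</updated>
    <link rel="related" href="http://example.org/agencies/7"/>
  </entry>
</feed>"""


def literal(text):
    """Quote a string as a plain N-Triples literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return '"' + escaped + '"'


def atom_to_ntriples(feed_xml):
    """Map each Atom entry to a few triples, using the entry id as subject."""
    triples = []
    root = ET.fromstring(feed_xml)
    for entry in root.iter(ATOM + "entry"):
        subject = entry.findtext(ATOM + "id")
        if not subject:
            continue  # without an id there is nothing to hang triples on
        subject = subject.strip()
        title = entry.findtext(ATOM + "title")
        if title:
            triples.append(f"<{subject}> <{DCT}title> {literal(title.strip())} .")
        updated = entry.findtext(ATOM + "updated")
        if updated:
            triples.append(f"<{subject}> <{DCT}modified> {literal(updated.strip())} .")
        # every link in the entry becomes a link in the graph
        for link in entry.findall(ATOM + "link"):
            href = link.get("href")
            if href:
                triples.append(f"<{subject}> <{DCT}relation> <{href}> .")
    return triples


if __name__ == "__main__":
    for line in atom_to_ntriples(SAMPLE_FEED):
        print(line)
```

the interesting part is not the code but the mapping: as long as the feed's semantics and links are well-defined, deciding which predicate each element maps to is the only real design work, and that decision can be made by whoever wants the triples.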
Sheesh, there is no wrath like that of an academic scorned... Are they serious?
Posted by: Mark Nottingham | Friday, November 20, 2009 at 14:28
Ouch. Your work has been very useful for me in trying to understand the dimensions of what people mean when they say "Linked Data".
Possibly of interest: Ian Davis kicked off a little thread over on SemanticOverflow to see what that community thinks generally about Atom as Linked Data: http://www.semanticoverflow.com/questions/193/can-atom-be-considered-linked-data
Posted by: edsu | Saturday, November 21, 2009 at 04:19
It's OK - I fixed the Wikipedia entry for 'Hyperdata' a few months ago, so that we can all use that instead...
http://en.wikipedia.org/w/index.php?title=Hyperdata&action=history
=0)
Posted by: Duncan-cragg | Saturday, November 21, 2009 at 04:50
Hi. I view this kind of RDF 'evangelism' as pretty damaging to the Semantic Web project. I'd like to talk to the person who sent you that mail; can you put me in touch? [email protected] ...
Posted by: Dan Brickley | Saturday, November 21, 2009 at 06:47
I'm the person in question, and the author of the email.
Erik, I apologise for the tone of my email, and for outrightly demanding something. That was inappropriate and completely missed the mark.
I realise that I should have just written a blog post explaining why I consider broadening the term “linked data” a bad idea, and then maybe respectfully pointing you to it.
So I've written that blog post now, trying to explain my opinion, and also responding to some of the points you made in your post:
http://dowhatimean.net/2009/11/whats-in-a-name
Posted by: Richard Cyganiak | Saturday, November 21, 2009 at 11:35
I kinda disagree with using “linked data” for RDF only; but if someone uses the term ‘REST’ and their application is grossly RPC, I would send a letter with a similar tone to the one you got ;)
(so I’m glad to see recovery.gov is genuinely RESTful)
Posted by: Nicolas | Saturday, November 21, 2009 at 13:28
Hi,
I use RDF and SPARQL a lot, but I don't believe Linked Data starts with a 'set of technologies' (and I doubt other people keen on the idea of Linked Data do either). It's also a bit more specific than simply the idea of openly accessible data.
At core, Linked Data is about the idea of a distributed system for publishing data in such a way that it connects up with other data that uses the same concepts, and can easily be merged and queried. It's about the idea that data becomes more useful the more it is connected with other data. RDF, using HTTP based URIs, seems to be the data representation most optimised for these use cases, which is why everyone else (so far as I've seen) describing their data as Linked Data, uses RDF.
Would you consider writing these stylesheets to make your data *also* available as RDF, so it can be linked to by other data (and hopefully it also does or will link to other concepts)?
Posted by: keithalexander.co.uk | Monday, November 23, 2009 at 01:37
Richard certainly chose his words poorly, but you shouldn't be surprised that people would be offended by your redefining a term like "linked data"; which has both a fairly precise meaning, and a community built up around its implementation. Indeed, reading between the lines here, it seems this may well have been your point; to be provocative. If that's the case, then why the indignation?
Posted by: Bruce | Monday, November 23, 2009 at 07:14
@Bruce: our goal in the report was not to be provocative. but we did want to point out that there is a difference between an objective and an implementation. my main issue with claiming very general terms such as "Linked Data" and "Web Services" and "Semantic Web" for very specific technology choices is that it confuses people. i have spent a lot of time explaining to people that a semantic web does not require the Semantic Web™ choice of technologies, that services on the web do not need Web Services™ technologies, and lately that linking data on the web does not need Linked Data™ technologies. i think confusing the distinction between an objective and a solution is a bad idea, it sells specific technology choices without looking at how appropriate they are for a given scenario. so in the end, our goal was definitely not to be provocative, but to make people aware of the fact that problems and solutions are two different things, and that we propose to use a different solution for the problem of linking data. we stated our reasons, and it is up to those financing such a project to decide which solution they prefer. telling them that there is only one solution means you're a salesman and not a consultant.
Posted by: dret | Monday, November 23, 2009 at 08:36
@keithalexander: sure, if i were paid for any of this i would write these stylesheets and make them available. but we're not paid. and isn't the important part to make data available in a way that's in line with the principles of the web? if the Linked Data community wants to use such a dataset, then there is nobody keeping it from writing those stylesheets. as long as the semantics and linkage of the data are well-defined, that is doable and not overly hard. and btw, we're currently doing research in the area of how to better connect RESTful services and consumers expecting RDF, but that's work in progress and not yet ready for production scenarios.
Posted by: dret | Monday, November 23, 2009 at 08:47
Erik: You say that terms like Semantic Web, Web Services and Linked Data are associated with particular technologies because the proponents of those technologies co-opted a generic term for their specific technologies. That's not true. In each of those cases, they actually *coined* the term in the first place. None of those terms were in use before the proponents of specific technologies coined and popularized them (to the best of my knowledge). That's an important point that you forget.
In the case of Linked Data, the term was coined in 2006. It wasn't until 2008 that anyone outside of the RDF community even noticed the term, and not before 2009 that people outside the RDF field started to try calling their stuff by the same name.
You say: “terminus technicus X doesn't need technologies A, B and C.” I think that it would be more honest—and less confusing—if you said: “The goal of terminus technicus X can also be achieved using other technologies than A, B and C.”
I don't understand why you are so keen on using names for your stuff that have been coined and popularized in association with different technologies. My assumption is that you just want to capitalize on the popularity of those terms, which causes nothing but confusion.
Posted by: Richard Cyganiak | Monday, November 23, 2009 at 10:51
@richard: i disagree with your claim of first coinage and subsequent right for exclusivity. i co-authored a book on linking XML data in 2002 (http://dret.net/netdret/publications#wil01a) which back then focused on XLink. we did not bother to "coin" a term because we focused on the problem and describing what was out there back then in terms of available technologies. when i started giving XML courses in 2000 (http://dret.net/netdret/publications#xml0402), they included a discussion about web semantics and the fact that XML had none built-in. same with web services, the ability to use XML for building APIs, and this was before SOAP even existed (i think XML-RPC was the only thing around back then). what i am saying is that the problems of (lowercased) semantic web, linked data, and web services were around earlier (and tackled in various projects and technologies) than the uppercase solutions, which i think is an important observation.
i do agree that if you "coin" a term you're valuing the name more than when you just casually use it as a generic problem name, which means that you might start using it more consistently, and trying to build a brand around it. and that's all fine, as long as you start building your own brand, instead of using a name of a broad class of problems that have been around for a while, and that other people care about as well, who might not necessarily buy into your branded product. what matters is that the terms were used before they were claimed, and unless we want to go down an eolas path, we have to be careful about how broadly you can claim an area by pointing to a specific implementation, and then claim it as yours.
http://www.ted.com/talks/tim_berners_lee_on_the_next_web.html for example, does not mention implementation-level issues once, and it seemed to me that this is the level of discussion we should engage in: we want our data to be open and accessible, and that's the large-scale objective. how we do it then is an issue of looking at the individual problem at hand, and deciding on the best technology/tool for that specific problem.
Posted by: dret | Monday, November 23, 2009 at 11:23
@richard: it's up to you to decide what you think my motives are. i just know from experience that when we talk to management-level IT people with less knowledge of the details of technologies than you and i have, branding matters a lot and thoroughly confuses people. they develop that mindset of "i want web services, i need SOAP", or "i want linked data, i need RDF". i see it as my job, as somebody who cares about problems as well as technologies, to explain that this is not the case, and it's actually hard to educate them to the level that "i want linked data, i need RDF" is not a correct statement, whereas "i want Linked Data™, i need RDF" actually is. both you and i understand the difference between those statements, but it's utterly confusing for most people, even in the IT sector. and yes, on a philosophical note, i do refuse the right of anybody to claim words or phrases of normal human language.
Posted by: dret | Monday, November 23, 2009 at 11:35
@richard: the funny thing is that we both accuse each other of "confusing people". you say i am confusing people because i misuse an established brand name, i am saying that this brand name confuses people because it uses a broad name of a problem class for a specific technology for solving that problem in one specific way. can we agree that this is our basic disagreement? if so, i'd be interested in other opinions about these two approaches for "name management".
Posted by: dret | Monday, November 23, 2009 at 11:43
Hey, dret. Richard was not in any way official when he hinted you to rethink your wording. He is some dude making very important work to a community of people working together. Dret, you were well aware about the linked data standard when you wrote the article and your recovery.gov web services. But the article goes on to intentionally rename "linked data" to an abstract concept. Redefining words is always problematic.
so, as we don't have police to tell us what a term means, we have to go to google.
http://www.google.com/search?q="linked+data"
The first ~100 hits on google all literally say: URI, HTTP, RDF, one of them being a wikipedia page. So, whatever the other commenters said above, google says: it's URIs+RDF+HTTP. Live with it. Or invent a new term. But don't redefine existing terms.
Trademarks: You don't need to have a trademark (tm) to be annoyed when someone calls an implementation the wrong name. There was an attempt in the 1990s by CERN to trademark the "world wide web" which was not possible at that time because it was already a common term.
So, CERN did not get its trademark, but the community was large enough at that point to enforce that "world wide web" is used properly to label only HTTP conformant servers. People could use the term as intended because they stuck to what Tim Berners Lee said.
The same with "microformats", if I go to the official US trademark database, I won't find microformats there:
http://www.uspto.gov/trademarks/index.jsp
So - neither microformats nor "world wide web" are trademarked. Still, the community knows what they mean, and they rely on webmasters using the terms as intended by the original creators (or, in our case, the dominant group who squats on google).
So - if I find a webpage that says it supports "microformats" or the "world wide web" (both non-trademarked terms) and I don't find the correct metadata and XML formats under the hood, I can't use it and I wasted my time trying to use it. In fact, I will be pissed off at the webadmin who said he/she supports it and in fact does not. It's a natural reaction to be frustrated. It's a matter of expectations which are not met. Unsatisfied expectations lead to frustration.
The next commenter may say "Leo - your example of a bad webadmin is about a webpage, but Wilde, Kansa, and Yee wrote a paper". Now I say: it won't stay in a paper.
We all mindlessly copy-paste text snippets from here to there. The text will be quoted out of context. Everyone knowing how to use google will interpret "linked data" the way defined on wikipedia and google, and not "the special way defined in the paper in Section 5 on page 6 of 11". Expectations won't be met, frustration may follow.
Another annoying thing is that the paper accidentally misquotes RDF as "centralized data storage", which is wrong. Taking this argument of "RDF is centralized data storage and sucks, so we need to redefine linked data" is an unfair move in an argumentation - RDF is distributed and the argument is wrong. Of course, getting this wrong also shines a more dim light on the rest of the article.
funnily, if richard wouldn't have found that paper, probably nobody would ever have noticed that "section 5" :-)
so, I hope that helped shine some light on trademarks and standards and why Richard was frustrated and many more people may be frustrated when they point their standards-based linked data browsers to a page that does in fact not support linked data.
greetings from another self-acclaimed member of the secret Linked Data Police ("no, the linked data police does not exist. But we have to take you into custody now for saying it does.")
Posted by: Leo Sauermann | Monday, November 23, 2009 at 12:10
Erik, I think we have two basic disagreements.
First, you say that people shouldn't coin generic names for specific technologies. My response is that you're complaining about something that is simply an inevitable fact of life. People coin generic names for specific technologies ALL THE TIME. Complaining about it is silly. I already gave a long list of successful specific technologies that use generic names. A World Wide Web doesn't require HTTP. A Structured Query Language doesn't require relational databases. An Extensible Markup Language doesn't require pointy brackets. Coinage of generic terms for specific technologies is commonplace, accept it.
Second, you say that trying to establish a generic term as a “brand” for a specific technology is confusing and therefore co-opting the term for other technologies is fair game. My response is that at a certain point, the association between generic term and specific technology becomes a fact of life, and the term is generally interpreted as referring to the specific technology, and hence if one wants to communicate clearly, one has to take that into account. This is not a matter of claiming rights over phrases; it's a matter of communicating clearly, which requires acknowledging the commonly understood meaning of a phrase, even if one doesn't agree with it. Hence my suggested rephrasing: “The goal behind {generic “brand” term} can be achieved in a better way without {specific technology}, by using {your specific technology of choice}.” (Case in point: The motto of this year's Topic Maps conference is “Linked Topic Maps,” which I don't find objectionable at all.)
You say that educating management-level folks about those subtle terminology differences is hard. But hey, that's life. We tried unsuccessfully for years to educate people that Semantic Web should include the *Web* somewhere, not just ontologies and logics. We eventually gave up and rallied around a different term (Linked Data), leaving the “Semantic Web” brand to the OWL folks. Made our life much easier.
Anyway, interesting discussion.
Posted by: Richard Cyganiak | Tuesday, November 24, 2009 at 06:23
You guys disappoint me - how come nobody brought up the best known well defined term which is also a general name: "Free Software"?
I agree with Richard, who has very valid points.
Let's forget the unfortunate tone of the email.
Posted by: Anchakor | Tuesday, November 24, 2009 at 09:57
@Anchakor: i am not quite sure why you are referring to "Free Software" here. assuming that you refer to richard stallman, who is basically mister free software, it is actually interesting to see how he handles it. in each talk/discussion, he starts with defining free software ("free" as in "free speech"), and then starts talking about it. he sets the frame for the duration of the conversation, and if you want to engage in that conversation, you know what the frame is for that conversation. this is actually quite the opposite of trying to set a frame that is global in space and time.
and of course i don't know for sure, but it's hard to imagine richard stallman sitting in his office googling "free software" and sending emails to those who refer to the "free beer" variety of free software. my guess is he just has better things to do.
Posted by: dret | Tuesday, November 24, 2009 at 14:00
HTML did well by being easy to just get on with. I get sad academics if: I get the charset wrong, the XML wrong, the RDF/XML wrong, the namespaces wrong, the semantics wrong.
It's an arse to publish RDF and most people won't bother. We need to provide gateways to make it easy to publish and consume linked data without having to understand it.
eg. foafmatic, Atom, RSS etc.
RDF is certainly the gold standard, but most people shouldn't be expected to produce it. As if they get it wrong they get shouted at by the, er, advocacy types.
Don't get me wrong, some of my best friends are Linked Data community members (as the expression goes)...
Posted by: Christopher Gutteridge | Monday, August 22, 2011 at 07:04