Friday, November 20, 2009

Comments

Mark Nottingham

Sheesh, there is no wrath like that of an academic scorned... Are they serious?

edsu

Ouch. Your work has been very useful for me in trying to understand the dimensions of what people mean when they say "Linked Data".

Possibly of interest: Ian Davis kicked off a little thread over on SemanticOverflow to see what that community thinks generally about Atom as Linked Data: http://www.semanticoverflow.com/questions/193/can-atom-be-considered-linked-data

Duncan-cragg

It's OK - I fixed the Wikipedia entry for 'Hyperdata' a few months ago, so that we can all use that instead...

http://en.wikipedia.org/w/index.php?title=Hyperdata&action=history

=0)

Dan Brickley

Hi. I view this kind of RDF 'evangelism' as pretty damaging to the Semantic Web project. I'd like to talk to the person who sent you that mail; can you put me in touch? danbri@danbri.org ...

Richard Cyganiak

I'm the person in question, and the author of the email.

Erik, I apologise for the tone of my email, and for outrightly demanding something. That was inappropriate and completely missed the mark.

I realise that I should have just written a blog post explaining why I consider broadening the term “linked data” a bad idea, and then maybe respectfully pointing you to it.

So I've written that blog post now, trying to explain my opinion, and also responding to some of the points you made in your post:
http://dowhatimean.net/2009/11/whats-in-a-name

Nicolas

I kinda disagree with using “linked data” for RDF only; but if someone uses the term ‘REST’ and their application is grossly RPC, I would send a letter with a similar tone to the one you got ;)

(so I’m glad to see recovery.gov is genuinely RESTful)

keithalexander.co.uk

Hi,
I use RDF and SPARQL a lot, but I don't believe Linked Data starts with a 'set of technologies' (and I doubt other people keen on the idea of Linked Data do either). It's also a bit more specific than simply the idea of openly accessible data.

At core, Linked Data is about the idea of a distributed system for publishing data in such a way that it connects up with other data that uses the same concepts, and can easily be merged and queried. It's about the idea that data becomes more useful the more it is connected with other data. RDF, using HTTP based URIs, seems to be the data representation most optimised for these use cases, which is why everyone else (so far as I've seen) describing their data as Linked Data, uses RDF.

Would you consider writing these stylesheets to make your data *also* available as RDF, so it can be linked to by other data (and hopefully it also does or will link to other concepts)?

Bruce

Richard certainly chose his words poorly, but you shouldn't be surprised that people would be offended by your redefining a term like "linked data", which has both a fairly precise meaning and a community built up around its implementation. Indeed, reading between the lines here, it seems this may well have been your point: to be provocative. If that's the case, then why the indignation?

dret

@Bruce: our goal in the report was not to be provocative. but we did want to point out that there is a difference between an objective and an implementation. my main issue with claiming very general terms such as "Linked Data" and "Web Services" and "Semantic Web" for very specific technology choices is that it confuses people. i have spent a lot of time explaining to people that a semantic web does not require the Semantic Web™ choice of technologies, that services on the web do not need Web Services™ technologies, and lately that linking data on the web does not need Linked Data™ technologies. i think confusing the distinction between an objective and a solution is a bad idea, it sells specific technology choices without looking at how appropriate they are for a given scenario. so in the end, our goal was definitely not to be provocative, but to make people aware of the fact that problems and solutions are two different things, and that we propose to use a different solution for the problem of linking data. we stated our reasons, and it is up to those financing such a project to decide which solution they prefer. telling them that there is only one solution means you're a salesman and not a consultant.

dret

@keithalexander: sure, if i were paid for any of this i would write these stylesheets and make them available. but we're not paid. and isn't the important part to make data available in a way that's in line with the principles of the web? if the Linked Data community wants to use such a dataset, then there is nobody keeping it from writing those stylesheets. as long as the semantics and linkage of the data are well-defined, that is doable and not overly hard. and btw, we're currently doing research in the area of how to better connect RESTful services and consumers expecting RDF, but that's work in progress and not yet ready for production scenarios.

Richard Cyganiak

Erik: You say that terms like Semantic Web, Web Services and Linked Data are associated with particular technologies because the proponents of those technologies co-opted a generic term for their specific technologies. That's not true. In each of those cases, they actually *coined* the term in the first place. None of those terms were in use before the proponents of specific technologies coined and popularized them (to the best of my knowledge). That's an important point that you forget.

In the case of Linked Data, the term was coined in 2006. It wasn't until 2008 that anyone outside of the RDF community even noticed the term, and not before 2009 that people outside the RDF field started to try calling their stuff by the same name.

You say: “terminus technicus X doesn't need technologies A, B and C.” I think that it would be more honest—and less confusing—if you said: “The goal of terminus technicus X can also be achieved using other technologies than A, B and C.”

I don't understand why you are so keen on using names for your stuff that have been coined and popularized in association with different technologies. My assumption is that you just want to capitalize on the popularity of those terms, which causes nothing but confusion.

dret

@richard: i disagree with your claim of first coinage and subsequent right for exclusivity. i co-authored a book on linking XML data in 2002 (http://dret.net/netdret/publications#wil01a) which back then focused on XLink. we did not bother to "coin" a term because we focused on the problem and describing what was out there back then in terms of available technologies. when i started giving XML courses in 2000 (http://dret.net/netdret/publications#xml0402), they included a discussion about web semantics and the fact that XML had none built-in. same with web services, the ability to use XML for building APIs, and this was before SOAP even existed (i think XML-RPC was the only thing around back then). what i am saying is that the problems of (lowercased) semantic web, linked data, and web services were around earlier (and tackled in various projects and technologies) than the uppercase solutions, which i think is an important observation.

i do agree that if you "coin" a term you're valuing the name more than when you just casually use it as a generic problem name, which means that you might start using it more consistently, and trying to build a brand around it. and that's all fine, as long as you start building your own brand, instead of using a name of a broad class of problems that have been around for a while, and that other people care about as well, who might not necessarily buy into your branded product. what matters is that the terms were used before they were claimed, and unless we want to go down an eolas path, we have to be careful about how broadly you can claim an area by pointing to a specific implementation, and then claim it as yours.

http://www.ted.com/talks/tim_berners_lee_on_the_next_web.html for example, does not mention implementation-level issues once, and it seemed to me that this is the level of discussion we should engage in: we want our data to be open and accessible, and that's the large-scale objective. how we do it then is an issue of looking at the individual problem at hand, and deciding on the best technology/tool for that specific problem.

dret

@richard: it's up to you to decide what you think my motives are. i just know from experience that when we talk to management-level IT people with less knowledge of the details of technologies than you and i have, branding matters a lot and thoroughly confuses people. they develop that mindset of "i want web services, i need SOAP", or "i want linked data, i need RDF". i see it as my job as somebody who cares about problems as well as technologies to explain that this is not the case, and it's actually hard to educate them to the level that "i want linked data, i need RDF" is not a correct statement, whereas "i want Linked Data™, i need RDF" actually is. both you and i understand the difference between those statements, but it's utterly confusing for most people, even in the IT sector. and yes, on a philosophical note, i do refuse the right of anybody to claim words or phrases of normal human language.

dret

@richard: the funny thing is that we both accuse each other of "confusing people". you say i am confusing people because i misuse an established brand name, i am saying that this brand name confuses people because it uses a broad name of a problem class for a specific technology for solving that problem in one specific way. can we agree that this is our basic disagreement? if so, i'd be interested in other opinions about these two approaches for "name management".

Leo Sauermann

Hey, dret. Richard was not in any way official when he hinted that you should rethink your wording. He is some dude doing very important work for a community of people working together. Dret, you were well aware of the linked data standard when you wrote the article and your recovery.org web services. But the article goes on to intentionally rename "linked data" to an abstract concept. Redefining words is always problematic.

so, as we don't have police to tell us what a term means, we have to go to google.

http://www.google.com/search?q="linked+data"

The first ~100 hits on google all literally say: URI, HTTP, RDF, one of them being a wikipedia page. So, whatever the other commenters said above, google says: it's URIs+RDF+HTTP. Live with it. Or invent a new term. But don't redefine existing terms.

Trademarks: You don't need to have a trademark (tm) to be annoyed when someone calls an implementation the wrong name. There was an attempt in the 1990s by CERN to trademark the "world wide web" which was not possible at that time because it was already a common term.
So, CERN did not get its trademark, but the community was large enough at that point to enforce that "world wide web" is used properly to label only HTTP conformant servers. People could use the term as intended because they stuck to what Tim Berners Lee said.
The same with "microformats", if I go to the official US trademark database, I won't find microformats there:

http://www.uspto.gov/trademarks/index.jsp

So - neither microformats nor "world wide web" are trademarked. Still, the community knows what they mean, and they rely on webmasters using the terms as intended by the original creators (or, in our case, the dominant group who squats on google).

So - if I find a webpage that says it supports "microformats" or the "world wide web" (both non-trademarked terms) and I don't find the correct metadata and XML formats under the hood, I can't use it and I wasted my time trying to use it. In fact, I will be pissed off at the webadmin who said he/she supports it and in fact does not. It's a natural reaction to be frustrated. It's a matter of expectations which are not met. Unsatisfied expectations lead to frustration.

The next commenter may say "Leo - your example of a bad webadmin is about a webpage, but Wilde, Kansa, and Yee wrote a paper". Now I say: it won't stay in a paper.

We all mindlessly copy-paste text snippets from here to there. The text will be quoted out of context. Everyone knowing how to use google will interpret "linked data" the way defined on wikipedia and google, and not "the special way defined in the paper in Section 5 on page 6 of 11". Expectations won't be met, frustration may follow.

Another annoying thing is that the paper accidentally misquotes RDF as "centralized data storage", which is wrong. Taking this argument of "RDF is centralized data storage and sucks, so we need to redefine linked data" is an unfair move in an argument - RDF is distributed and the argument is wrong. Of course, getting this wrong also casts a dimmer light on the rest of the article.

funnily, if richard hadn't found that paper, probably nobody would ever have noticed that "section 5" :-)

so, I hope that helped shine some light on trademarks and standards and why Richard was frustrated and many more people may be frustrated when they point their standards-based linked data browsers to a page that does in fact not support linked data.

greetings from another self-acclaimed member of the secret Linked Data Police ("no, the linked data police does not exist. But we have to take you into custody now for saying it does.")

Richard Cyganiak

Erik, I think we have two basic disagreements.

First, you say that people shouldn't coin generic names for specific technologies. My response is that you're complaining about something that is simply an inevitable fact of life. People coin generic names for specific technologies ALL THE TIME. Complaining about it is silly. I already gave a long list of successful specific technologies that use generic names. A World Wide Web doesn't require HTTP. A Structured Query Language doesn't require relational databases. An Extensible Markup Language doesn't require pointy brackets. Coinage of generic terms for specific technologies is commonplace, accept it.

Second, you say that trying to establish a generic term as a “brand” for a specific technology is confusing and therefore co-opting the term for other technologies is fair game. My response is that at a certain point, the association between generic term and specific technology becomes a fact of life, and the term is generally interpreted as referring to the specific technology, and hence if one wants to communicate clearly, one has to take that into account. This is not a matter of claiming rights over phrases; it's a matter of communicating clearly, which requires acknowledging the commonly understood meaning of a phrase, even if one doesn't agree with it. Hence my suggested rephrasing: “The goal behind {generic “brand” term} can be achieved in a better way without {specific technology}, by using {your specific technology of choice}.” (Case in point: The motto of this year's Topic Maps conference is “Linked Topic Maps,” which I don't find objectionable at all.)

You say that educating management-level folks about those subtle terminology differences is hard. But hey, that's life. We tried unsuccessfully for years to educate people that Semantic Web should include the *Web* somewhere, not just ontologies and logics. We eventually gave up and rallied around a different term (Linked Data), leaving the “Semantic Web” brand to the OWL folks. Made our life much easier.

Anyway, interesting discussion.

Anchakor

You guys disappoint me - how come nobody brought up the best known well defined term which is also a general name: "Free Software"?

I agree with Richard, who has very valid points.
Let's forget the unfortunate tone of the email.

dret

@Anchakor: i am not quite sure why you are referring to "Free Software" here. assuming that you refer to richard stallman, who is basically mister free software, it is actually interesting to see how he handles it. in each talk/discussion, he starts with defining free software ("free" as in "free speech"), and then starts talking about it. he sets the frame for the duration of the conversation, and if you want to engage in that conversation, you know what the frame is for that conversation. this is actually quite the opposite of trying to set a frame that is global in space and time.

and of course i don't know for sure, but it's hard to imagine richard stallman sitting in his office googling "free software" and sending emails to those who refer to the "free beer" variety of free software. my guess is he just has better things to do.

Christopher Gutteridge

HTML did well by being easy to just get on with. I get sad academics if: I get the charset wrong, the XML wrong, the RDF/XML wrong, the namespaces wrong, the semantics wrong.

It's an arse to publish RDF and most people won't bother. We need to provide gateways to make it easy to publish and consume linked data without having to understand it.

eg. foafmatic, Atom, RSS etc.

RDF is certainly the gold standard, but most people shouldn't be expected to produce it. As if they get it wrong they get shouted at by the, er, advocacy types.

Don't get me wrong, some of my best friends are Linked Data community members (as the expression goes)...

The comments to this entry are closed.