The problems with RDF

The first problem with RDF is that most people's reaction to the term is "WTF is RDF?"

RDF stands for Resource Description Framework, and it is a cornerstone of the Semantic Web as envisioned by Tim Berners-Lee. The basic idea of RDF is very simple: it describes a series of linked objects, with each link taking the form Subject -- Predicate --> Object. For example, John -- owns --> a Cat.
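The Subject -- Predicate --> Object form maps onto a three-field record, which may make the idea more concrete. The following is only a minimal sketch in Python, not any official RDF API: the Triple class, the objects_of helper and the "John -- knows --> Mary" statement are all made up for illustration.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: Subject -- Predicate --> Object (illustrative type, not a real RDF API)."""
    subject: str
    predicate: str
    obj: str


# A graph is just a collection of statements.
graph = [
    Triple("John", "owns", "Cat"),    # the example from the text
    Triple("John", "knows", "Mary"),  # hypothetical extra statement
]


def objects_of(statements, subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return [t.obj for t in statements
            if t.subject == subject and t.predicate == predicate]


def describe(t):
    """Render a statement in the arrow notation used above."""
    return f"{t.subject} -- {t.predicate} --> {t.obj}"


for statement in graph:
    print(describe(statement))
```

Asking "what does John own?" is then a simple filter over the statements: objects_of(graph, "John", "owns").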

For the last nine years, Tim has been championing XML and RDF as the solution to the Semantic Web problem.

Yet despite the Web becoming so pervasive that it is almost impossible to imagine living without it, the Semantic Web remains a distant dream, and RDF a niche technology known only to a hardcore few. So what has gone wrong?

The second problem with RDF is that the current syntax options are obscure and, in the case of RDF/XML, almost impossible to read and write by hand.

RDF currently comes in two major flavours: Notation 3 (commonly abbreviated to N3) and RDF/XML.

While N3 has a slightly simpler syntax and includes basic rule processing, it lacks the common framework of understanding provided by the XML language.

On the other hand, while RDF/XML appears to provide readable XML options for trivial examples, any attempt to model an RDF graph of even slight complexity quickly becomes almost impossible to parse by eye:

<rdf:Description rdf:about="http://www.w3.org/TR/rdf-syntax-grammar">
  <ex:editor>
    <rdf:Description>
      <ex:homePage>
        <rdf:Description rdf:about="http://purl.org/net/dajobe/">
        </rdf:Description>
      </ex:homePage>
    </rdf:Description>
  </ex:editor>
</rdf:Description> 

As Tim Bray puts it,

"Where, pray tell, are the resources, properties, and values? What benefit could I expect to derive from viewing this particular source?"

I first encountered RDF -- technically, the RDF/XML dialect -- while attempting to build a dynamic tree for a Mozilla extension listing preferences. (Note that this was back in the pre-1.0 days, before Firefox or even Firebird existed; Mozilla is now moving away from RDF/XML towards SQLite implementations.)

I dutifully read the Mozilla XUL tutorial on using RDF/XML to generate templates. Since I didn't really understand it, I went to the W3C's official RDF site and read through the tutorial there. That confused me even more!

Eventually I got a rough handle on the concepts, but I never did get those damn templates working properly, and it took me hours to work out which tag was a subject, which was a predicate, and which was an object.
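For what it's worth, the trick with the RDF/XML example quoted earlier is that the nesting alternates: a node element (a subject or object), then a property element (a predicate), then a node element again. The Python sketch below walks that alternation using the standard library's xml.etree.ElementTree. It is a rough illustration rather than a real RDF/XML parser: it only handles the nested pattern in that excerpt, the rdf:RDF wrapper is added so the excerpt parses on its own, the ex: namespace URI is an assumption (the excerpt does not declare it), and the _:b1 style labels for unnamed nodes are generated here.

```python
import itertools
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX_NS = "http://example.org/stuff/1.0/"  # assumed: the excerpt omits its namespace declarations

# The excerpt from above, wrapped in rdf:RDF so that it is a complete document.
DOC = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:ex="{EX_NS}">
<rdf:Description rdf:about="http://www.w3.org/TR/rdf-syntax-grammar">
  <ex:editor>
    <rdf:Description>
      <ex:homePage>
        <rdf:Description rdf:about="http://purl.org/net/dajobe/">
        </rdf:Description>
      </ex:homePage>
    </rdf:Description>
  </ex:editor>
</rdf:Description>
</rdf:RDF>"""


def extract_triples(xml_text):
    """Walk node -> property -> node nesting and collect (subject, predicate, object)."""
    root = ET.fromstring(xml_text)
    blank_ids = itertools.count(1)
    triples = []

    def name_of(node):
        # A node with rdf:about is named by that URI; otherwise invent a blank-node label.
        about = node.get(f"{{{RDF_NS}}}about")
        return about if about is not None else f"_:b{next(blank_ids)}"

    def walk(node, subject):
        for prop in node:                  # children of a node element are predicates
            predicate = prop.tag.replace(f"{{{EX_NS}}}", "ex:")
            for child in prop:             # children of a predicate are object nodes
                obj = name_of(child)
                triples.append((subject, predicate, obj))
                walk(child, obj)           # the object may be the subject of more statements

    for top in root:
        walk(top, name_of(top))
    return triples


for s, p, o in extract_triples(DOC):
    print(f"{s} -- {p} --> {o}")
```

Seen this way, ten lines of XML boil down to two statements: the specification has an editor, and that (unnamed) editor has a home page.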

The third problem with RDF is that it contains too much jargon.

RDF/XML was written by people with a deep background in and understanding of semantic concepts: graphs, subjects, predicates, objects, statements, conjunctions, universal and existential quantifiers, schemas, ontologies -- the list goes on.

Naturally, any technical language or dialect involves some complexity, but RDF seems to almost wilfully obscure meaning in order to maintain its academic purity.

Unlike the original World Wide Web, which was built by amateurs who could grasp that <B>the B tag</B> meant that text became bold, RDF/XML has no such easy "in". And yet relating objects to one another through concepts is an intuitive human behaviour.

Although the world certainly has enough XML dialects, I think that RDF will remain a rare solution until a language is created whose representation matches the way that humans think. My contribution to the debate is R3: Reasonably Readable RDF.

It was Tim Bray's RDF.net challenge which first got me thinking about this problem. (Some might say that 3 1/2 years is a long time to respond to a challenge!) While the concept of R3 has been floating around in my head for a long time, it was only recently that everything came together.

Rhino and its native E4X support for XML helped. So did closely examining N3 and comparing it to RDF/XML, RDFS and OWL to identify where the concepts these languages express overlap.

At this point the spec is only a draft, but it should be usable enough to build real-world implementations. I plan to try an E4X query engine first and see what can be done from there.