metadata rdf atom-feed rdfa dublin-core

Format to use for exposing structured metadata (Dublin Core, RDF, Atom)?


In an altruistic manner, I would like to expose as much structured data about my website as possible. I also wouldn't mind an SEO boost, but it's secondary.

Seems there are a couple of options:

I'm just trying to make it easy for people to get data off my site.

The nice thing about standards is that there are so many of them to choose from.

Which one do you think I should use?


Solution

  • RDF is not just XML; RDF is a data model built on sets of triples (subject, predicate, object) that uses URIs to refer to things unambiguously. In practice, people working with RDF tend to avoid RDF/XML and prefer Turtle or N-Triples, or even RDF in JSON format. These serializations are more readable, easier to construct and easier to parse. Moreover, many tools can convert between the whole range of RDF flavors (e.g., rapper or Jena).
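
    To make the triple model concrete, here is a minimal Python sketch that holds triples as plain tuples and writes them out as N-Triples. It uses only the standard library; the example.org URIs and the helper names (iri, literal, to_ntriples) are assumptions made up for illustration, and a real project would more likely use an RDF library or a tool such as rapper or Jena.

```python
# Minimal sketch: RDF triples as plain tuples, serialized as N-Triples.
# All example.org URIs and helper names are hypothetical.

def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return "<" + value + ">"


def literal(value):
    """Quote a plain string literal, escaping backslashes, quotes and newlines."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )
    return '"' + escaped + '"'


def to_ntriples(triples):
    """Serialize (subject, predicate, object) terms, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


REVIEW = iri("http://example.org/review/1")

triples = [
    # subject, predicate (a Dublin Core term), object
    (REVIEW, iri("http://purl.org/dc/terms/title"), literal("A hot sauce review")),
    (REVIEW, iri("http://purl.org/dc/terms/creator"), iri("http://example.org/user/alice")),
]

if __name__ == "__main__":
    print(to_ntriples(triples))
```

    Each output line is one complete statement, which is why N-Triples is so easy to generate and parse line by line.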

    When it comes to publishing information in RDF, you generally have three different choices:

    1. To provide RDF dumps of your data.
    2. To publish RDF following the Linked Data rules.
    3. To add metadata to your existing Web pages with RDFa.

    ... these are not exclusive. You can go for any combination of them; the most important thing is choosing the correct structure of URIs (see Cool URIs don't change).
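
    As a rough illustration of the third option, the Python sketch below renders an HTML fragment annotated with RDFa attributes that carry Dublin Core terms. The function name rdfa_review, the markup layout and the example.org URI are assumptions for illustration only, not how any particular site does it.

```python
# Sketch: generating an HTML fragment annotated with RDFa (Dublin Core terms).
# The function name, markup layout and example.org URI are hypothetical.
from html import escape

DC = "http://purl.org/dc/terms/"


def rdfa_review(uri, title, creator):
    """Return an HTML fragment whose RDFa attributes describe one review."""
    return (
        f'<div prefix="dc: {DC}" about="{escape(uri)}">\n'
        f'  <h1 property="dc:title">{escape(title)}</h1>\n'
        f'  <p>Reviewed by <span property="dc:creator">{escape(creator)}</span></p>\n'
        f"</div>"
    )


if __name__ == "__main__":
    print(rdfa_review("http://example.org/review/1", "Hot sauce & chips", "alice"))
```

    The visible text stays ordinary HTML; the about and property attributes are what an RDFa parser turns into triples.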

    Following your SO profile I see that you're working on a social taste recommendation website (http://evocatus.com/). I assume that you might want to expose information about those reviews. So for a review like http://evocatus.com/sauce/cholula-chipolte-hot-sauce/272645/ you can provide different serializations and give back not just HTML but also:

    In addition, the HTML version could be enhanced with RDFa. Depending on the type of client that consumes your data, and following content negotiation rules, you redirect the HTTP request to whichever format the client accepts. The client states this in the HTTP Accept header. So a request like the curl call below would be redirected by your application to the RDF/XML version:

    curl -H 'Accept: application/rdf+xml' .../holula-chipolte-hot-sauce/272645/
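
    On the server side, the choice could be made roughly as in the Python sketch below. It is a simplified reading of the Accept header (media types and q-values only), and the AVAILABLE mapping, the file names and the function names are assumptions for illustration; a web framework's built-in negotiation would normally be preferable.

```python
# Sketch of server-side content negotiation on the HTTP Accept header.
# AVAILABLE, the file names and the function names are hypothetical.

AVAILABLE = {
    "text/html": "review.html",
    "application/rdf+xml": "review.rdf",
    "text/turtle": "review.ttl",
}


def parse_accept(header):
    """Return (media_type, q) pairs sorted by descending q-value."""
    choices = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        choices.append((media_type, q))
    # sorted() is stable, so equal q-values keep the client's order.
    return sorted(choices, key=lambda c: c[1], reverse=True)


def negotiate(header, available=AVAILABLE, default="text/html"):
    """Pick the media type to serve; fall back to HTML instead of a 406."""
    for media_type, q in parse_accept(header):
        if q <= 0:
            continue
        if media_type in available:
            return media_type
        if media_type == "*/*":
            return default
    return default


if __name__ == "__main__":
    chosen = negotiate("application/rdf+xml")
    print(chosen, "->", AVAILABLE[chosen])
```

    The application would then redirect to, or directly serve, the document mapped to the chosen media type.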
    

    In the future, people will be able to say things about existing reviews on your site just by reusing your URIs in their own RDF data. That's the power of RDF and Linked Data.

    About Dublin Core, you could use Dublin Core with either RDF or RDFa. But, in your case there are some other interesting ontologies to consider and the right thing would be to use a mix of all of them:

    There is one site called http://revyu.com/ that uses all these ontologies (except GoodRelations), so you could use it as a guideline. See for instance:

    ... these are HTML and RDF versions of the same review.

    Unlike with Atom, as you can see, with RDF you can reuse existing ontologies, and since RDF is based on URIs, everything can be interlinked.

    Linked Data Added Value

    What would happen if you invested some time linking your products and reviews to other data sources (e.g., dbpedia.org or freebase.com)? Let's imagine that you start linking all your beer reviews (http://evocatus.com/beer/) to the brewery that manufactures each product (http://dbpedia.org/page/Alcoholic_beverage). By following the links, you would be able to find out, for instance, where the preferred beers are manufactured. DBpedia holds that information.
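
    The idea of following links across datasets can be sketched in a few lines of Python. Everything in this example is invented sample data: the example.org URIs, the vocabulary terms and the "Springfield" location are placeholders, not real DBpedia content, and a real setup would fetch the external triples over HTTP or query them with SPARQL.

```python
# Sketch: following a link from local review data into an external dataset.
# Every URI, vocabulary term and fact below is invented sample data.

VOCAB = "http://example.org/vocab/"

# Triples published by the review site (hypothetical).
LOCAL = [
    ("http://example.org/beer/1", VOCAB + "brewedBy", "http://example.org/external/BreweryA"),
    ("http://example.org/beer/1", VOCAB + "rating", "5"),
]

# Triples that an external source might publish about the same brewery URI.
EXTERNAL = [
    ("http://example.org/external/BreweryA", VOCAB + "location", "Springfield"),
]


def objects(triples, subject, predicate):
    """Return all objects of triples matching the subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def brewery_locations(beer):
    """Follow beer -> brewery -> location across the merged datasets."""
    merged = LOCAL + EXTERNAL  # merging RDF graphs is just a union of triples
    locations = []
    for brewery in objects(merged, beer, VOCAB + "brewedBy"):
        locations.extend(objects(merged, brewery, VOCAB + "location"))
    return locations


if __name__ == "__main__":
    print(brewery_locations("http://example.org/beer/1"))
```

    Because both datasets use the same URI for the brewery, merging them needs no mapping step; that shared identifier is what makes the link useful.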

    Note also that Freebase, which provides RDF versions as well, lets you link to manufacturers. For instance, see http://rdf.freebase.com/rdf/en.budweiser in RDF or http://www.freebase.com/view/en/budweiser in HTML.