I am trying to understand RDF/XML basics. I have a problem understanding data referencing.
For a simple example, let's consider the relationship between Person and Document.
In a relational data model, this would be a simple one-to-many relationship (a Person can have many Documents, but a Document can only belong to one Person). So this could be solved by having a person_id column in the documents table.
But how do I achieve something like this in RDF/XML?
I define my <namespace:Person rdf:about="http://www.foo.com"> with all the attributes, and I also have <namespace:Document rdf:about="http://www.bar.com">, but what is the correct way of saying that a document belongs to a person whose id = x?
RDF is a model that uses triples (a.k.a. RDF statements) to express data. Each statement has a subject, a predicate, and an object. Typically, the predicate expresses the relationship between the subject and the object. A collection of such statements can be thought of as a graph (with the subjects and objects as vertices, and the predicates as edges).
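To make the triple model concrete, here is a minimal Python sketch that uses plain tuples rather than an RDF library. The identifiers ex:d1, ex:d2, ex:d3, ex:p1, ex:p2 and the predicate ex:belongsTo are hypothetical names chosen only to mirror the question; nothing here is a standard vocabulary. It shows that a one-to-many relationship needs no foreign-key column: it is simply several statements that share the same object.

```python
# Each RDF statement is a (subject, predicate, object) tuple.
# All names below are hypothetical, for illustration only.
triples = {
    ("ex:d1", "ex:belongsTo", "ex:p1"),
    ("ex:d2", "ex:belongsTo", "ex:p1"),
    ("ex:d3", "ex:belongsTo", "ex:p2"),
}


def documents_of(person, graph):
    """Return every subject linked to `person` by the ex:belongsTo edge."""
    return sorted(
        s for s, p, o in graph if p == "ex:belongsTo" and o == person
    )


# Graph view: subjects and objects are vertices, predicates label the edges.
vertices = {s for s, _, _ in triples} | {o for _, _, o in triples}
edge_labels = {p for _, p, _ in triples}

print(documents_of("ex:p1", triples))  # ['ex:d1', 'ex:d2']
```

The lookup runs against the object position of the statement, which is roughly what a join on person_id does in the relational case.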
So in your example of Documents and Persons, let's first of all establish what relationship Documents and Persons have. For the sake of example, let's assume that you wish to express that a Document has an author, who is a Person.
If we apply this to a specific Document ex:d1 and a specific Person ex:p1, we would simply write the following triples to express the relation:
ex:d1 a ex:Document ;
    ex:hasAuthor ex:p1 .
ex:p1 a ex:Person .
The above is Turtle syntax, by the way: a syntax for RDF that is easier to read and write than RDF/XML. See the RDF Primer for details.
In RDF/XML syntax, the same data would look something like this:
<ex:Document rdf:about="http://example.org/d1">
    <ex:hasAuthor rdf:resource="http://example.org/p1"/>
</ex:Document>
<ex:Person rdf:about="http://example.org/p1"/>
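As a rough check that the RDF/XML and the triples say the same thing, the sketch below wraps the snippet in an rdf:RDF root element with namespace declarations and reads the statements back out using only Python's standard library. It assumes that the ex: prefix expands to http://example.org/ (consistent with the rdf:about values above), and it handles only this one shape of RDF/XML: typed nodes with rdf:about, and properties with rdf:resource. It is not a general RDF/XML parser; for real data you would use an RDF library.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Complete document: the rdf:RDF wrapper and the ex: namespace IRI are
# assumptions added so that the snippet is well-formed XML.
RDF_XML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/">
  <ex:Document rdf:about="http://example.org/d1">
    <ex:hasAuthor rdf:resource="http://example.org/p1"/>
  </ex:Document>
  <ex:Person rdf:about="http://example.org/p1"/>
</rdf:RDF>
"""


def tag_to_iri(tag):
    """ElementTree spells qualified names as '{namespace}local'; join them."""
    return tag[1:].replace("}", "", 1)


def extract_triples(xml_text):
    """Read (subject, predicate, object) tuples from the simple shape above."""
    root = ET.fromstring(xml_text)
    found = []
    for node in root:
        subject = node.get(f"{{{RDF_NS}}}about")
        # The element name of a typed node is shorthand for an rdf:type triple.
        found.append((subject, RDF_NS + "type", tag_to_iri(node.tag)))
        for prop in node:
            obj = prop.get(f"{{{RDF_NS}}}resource")
            found.append((subject, tag_to_iri(prop.tag), obj))
    return found


triples = extract_triples(RDF_XML)
for triple in triples:
    print(triple)
```

The three tuples printed correspond one-to-one to the three Turtle statements: two rdf:type statements and one ex:hasAuthor statement.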
But, like I mentioned in my comment: it's more effective to understand RDF modeling in the abstract (thinking about triples and graphs) than to start from how to write RDF/XML.
Back to the example: the above shows how you model a relationship between a specific document and a specific person. If you wish to express the more general information that "documents and persons are classes that can be related through an author-relation", you can use the RDF Schema vocabulary. You would express this as follows:
ex:Document a rdfs:Class .
ex:Person a rdfs:Class .
ex:hasAuthor a rdf:Property ;
    rdfs:domain ex:Document ;
    rdfs:range ex:Person .
Note that an RDF Schema is not the same thing as a relational schema! A relational schema's purpose is to prescribe structure and allow data validation. An RDF vocabulary (or ontology) is used to describe the world. All the above says is that documents and persons exist in our world, and that if two things have an 'author'-relation between them, the subject is a document and the object is a person.
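The difference from validation can be sketched in a few lines of Python. This is a simplified illustration of how rdfs:domain and rdfs:range are interpreted, using plain tuples instead of an RDF library; the resources ex:x and ex:y are hypothetical and deliberately have no declared type. Instead of rejecting the untyped data, the schema lets us conclude new type statements.

```python
RDF_TYPE = "rdf:type"

# The schema statements from the Turtle example above.
schema = {
    ("ex:hasAuthor", "rdfs:domain", "ex:Document"),
    ("ex:hasAuthor", "rdfs:range", "ex:Person"),
}

# Hypothetical data: neither ex:x nor ex:y has a declared type.
data = {
    ("ex:x", "ex:hasAuthor", "ex:y"),
}


def infer_types(schema, data):
    """Derive rdf:type statements from rdfs:domain and rdfs:range."""
    inferred = set()
    for prop, rel, cls in schema:
        for s, p, o in data:
            if p != prop:
                continue
            if rel == "rdfs:domain":
                # The subject of the property is an instance of the domain.
                inferred.add((s, RDF_TYPE, cls))
            elif rel == "rdfs:range":
                # The object of the property is an instance of the range.
                inferred.add((o, RDF_TYPE, cls))
    return inferred


inferred = infer_types(schema, data)
for triple in sorted(inferred):
    print(triple)
# ('ex:x', 'rdf:type', 'ex:Document')
# ('ex:y', 'rdf:type', 'ex:Person')
```

A relational database would refuse a row that violates a constraint. Here nothing is refused: the statement about ex:x and ex:y is accepted, and the schema simply tells us more about what they are.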