[SOLVED] Neo4j Family Tree Relationship Design

I am designing an extended family tree using Neo4j. During the design of the relationships I came up with two approaches:

CREATE (p:Person)-[:PARENT_OF]->(s:Person);
CREATE (p:Person)-[:STEPPARENT_OF]->(s:Person);
CREATE (p:Person)-[:MARRIED_TO]->(s:Person);

With this approach I create a separate relationship type for every case (keep in mind that there will be many cases, and therefore many relationship types).

CREATE (p:Person)-[r:PARENT_OF {type:'natural'}]->(s:Person);
CREATE (p:Person)-[r:PARENT_OF {type:'step'}]->(s:Person);
CREATE (p:Person)-[r:SPOUSE_OF {type:'marriage'}]->(s:Person);

With this approach there will be fewer relationship types, but the design is a little messy.

I would like to know which approach will be better and why?

Solution

You are choosing between fine-grained relationship types (:PARENT_OF, :STEPPARENT_OF, :MARRIED_TO) and generic relationship types qualified by a property (:PARENT_OF {type:'natural'}, :PARENT_OF {type:'step'}, :SPOUSE_OF {type:'marriage'}).

The book Graph Databases by Ian Robinson, Jim Webber, and Emil Eifrém (available for download from the Neo4j site) says:

Differentiating by relationship name is the best way of eliminating large swathes of the graph from a traversal. Using one or more property values to decide whether or not to follow a relationship incurs extra I/O the first time those properties are accessed because the properties reside in a separate store file from the relationships (after that, however, they’re cached).
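To make the difference concrete, here is a minimal sketch of the same question ("who are this person's natural children?") asked against each design. The `name` property and the value 'Alice' are assumptions for illustration only; they do not appear in the question.

```cypher
// Fine-grained design: the relationship type alone decides which
// relationships are followed, so step-parent links are never touched.
MATCH (p:Person {name: 'Alice'})-[:PARENT_OF]->(child:Person)
RETURN child.name;

// Generic design: every PARENT_OF relationship is a candidate, and the
// 'type' property must be read on each one to decide whether to keep it.
MATCH (p:Person {name: 'Alice'})-[r:PARENT_OF]->(child:Person)
WHERE r.type = 'natural'
RETURN child.name;
```

Both queries return the same rows if the two models hold the same family data; what differs is how much of the graph has to be inspected to get there.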

Remember that a graph database model should be built around the application's needs. That is, the right choice depends mainly on the kinds of queries you will run against the database.

If your graph traversal queries need to evaluate the kind of relationship, it is probably a good idea to split it into separate relationship types.
Otherwise, keep it as a property of a generic relationship type.
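As a rough illustration of a traversal that does depend on the kind of relationship, consider finding only the blood ancestors of one person. This is a sketch under assumptions: the `name` property and the value 'Bob' are hypothetical, and the two queries assume the fine-grained and generic models from the question respectively.

```cypher
// Fine-grained types: follow PARENT_OF upward to any depth.
// STEPPARENT_OF relationships are excluded simply by not naming them.
MATCH (ancestor:Person)-[:PARENT_OF*1..]->(d:Person {name: 'Bob'})
RETURN DISTINCT ancestor.name;

// Generic type with a property: each relationship on the path has to be
// checked to make sure no step-parent link slipped in.
MATCH path = (ancestor:Person)-[:PARENT_OF*1..]->(d:Person {name: 'Bob'})
WHERE all(r IN relationships(path) WHERE r.type = 'natural')
RETURN DISTINCT ancestor.name;

// When the distinction does not matter, fine-grained types can still be
// combined in one pattern.
MATCH (p:Person)-[:PARENT_OF|STEPPARENT_OF]->(d:Person {name: 'Bob'})
RETURN p.name;
```

The last query shows that choosing fine-grained types does not prevent you from treating them as one group when a query calls for it.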