I have the following information in a Titan graph database. I am trying to make sense of it by sending queries through the Gremlin shell. The graph database that I am investigating models a network. There are two types of vertices:
- `Switch`
- `Port`
I am trying to figure out the relationship between these two types of vertices.
g = TitanFactory.open("/tmp/cassandra.titan")
To see the list of vertices of each type:
$ g.V('type', 'switch')
==>v[228]
==>v[108]
==>v[124]
==>v[92]
==>v[156]
==>v[140]
$ g.V('type', 'port')
==>v[160]
==>v[120152]
==>v[164]
==>v[120156]
==>v[560104]
==>v[680020]
==>v[680040]
==>v[112]
==>v[120164]
==>v[560112]
==>v[680012]
==>v[680004]
==>v[144]
==>v[680032]
==>v[236]
==>v[100]
==>v[560128]
==>v[128]
==>v[680028]
==>v[232]
==>v[96]
To find the relation between a switch and its ports:
g.v(108).out
==>v[560104]
==>v[680004]
==>v[112]
What is this `out`? As I understand it, there is an outward arrow pointing from the switch represented by vertex 108 to the ports represented by vertices 560104, 680004 and 112.
What are `in` and `out`? Are they something specific to graph databases? Also, what is a label in a graph database? Are `in` and `out` labels?
The use of `in` and `out` is descriptive of the direction of the edge going from one vertex to another. In your case, you have this:
switch --> port
When you write:
g.v(108).out
you are telling Gremlin to find the vertex at 108, then walk along edges that point `out`, or away from it. You might also think of `out` as starting from the tail of the arrow and walking to the head. Given your schema, those lead to "ports".
Similarly, `in` simply means to have Gremlin walk along edges that point `in` to the vertex. You might also think of `in` as starting from the head of the arrow and walking to the tail. Given your schema, switches have no `in` edges, so traversing `in` from a switch will always return no results. However, if you were to start from a "port" vertex and traverse `in`:
g.v(560104).in
you would at least get back vertex 108, as vertex 560104 has at least one edge with an arrow pointing to it (given what I know of your sample data).
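If it helps to see the idea outside of Gremlin, here is a minimal Python sketch of what "direction" means. It is only a toy model, not Titan's or Gremlin's API: the edge list is assumed from the sample output in the question, and the helper names `out` and `in_` are made up for illustration.

```python
# Toy model of edge direction; this is NOT Gremlin or Titan's API.
# Each edge is a (tail, head) pair: the arrow runs from tail to head.
# The three edges are assumed from the g.v(108).out output in the question.
EDGES = [(108, 560104), (108, 680004), (108, 112)]


def out(vertex_id):
    """Follow edges that point away from the vertex (tail to head)."""
    return [head for tail, head in EDGES if tail == vertex_id]


def in_(vertex_id):
    """Follow edges that point into the vertex (head to tail)."""
    return [tail for tail, head in EDGES if head == vertex_id]


print(out(108))     # ports reached from switch 108
print(in_(560104))  # switches with an edge pointing into port 560104
print(in_(108))     # nothing points into the switch in this toy data
```

The same edge is seen by both calls; the only difference is which end of the arrow you start from.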
By now you've gathered that `in` and `out` are "directions" and not "labels". A label has a different purpose; it categorizes an edge. For example, you might have the following schema:
switch --connectsTo--> port
company --manufactures--> switch
switch --locatedIn--> rack
In other words, you might have three edge labels representing different ways that a "switch" relates to other parts of your schema. This lets your queries be more descriptive about what you want. Given your previous example and this revised schema, you would have to write the following to get the same result you originally showed:
g.v(108).out("connectsTo")
==>v[560104]
==>v[680004]
==>v[112]
Graph databases will typically take advantage of these labels to help improve performance of queries.
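To make the role of a label concrete, here is a small Python sketch of the revised schema. Again, this is a toy model under stated assumptions rather than Gremlin itself: vertex ids 300 (a company) and 400 (a rack) are hypothetical, and `out` is an illustrative helper, not a real Titan call.

```python
# Toy model of edge labels; this is NOT Gremlin or Titan's API.
# Each edge is (tail, label, head). Ids 300 and 400 are hypothetical.
EDGES = [
    (108, "connectsTo", 560104),
    (108, "connectsTo", 680004),
    (108, "connectsTo", 112),
    (300, "manufactures", 108),  # hypothetical company vertex
    (108, "locatedIn", 400),     # hypothetical rack vertex
]


def out(vertex_id, label=None):
    """Follow outgoing edges, optionally keeping only one label."""
    return [
        head
        for tail, edge_label, head in EDGES
        if tail == vertex_id and (label is None or edge_label == label)
    ]


print(out(108))                # ports and the rack, mixed together
print(out(108, "connectsTo"))  # only the ports
print(out(108, "locatedIn"))   # only the rack
```

Without a label, the traversal from the switch returns every outgoing neighbour regardless of what the edge means; with a label, it returns only the relationship you asked for.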