Using the Cosmos DB Gremlin API, I'm trying to write a Gremlin query that counts edges grouped by edge label and by the labels of the vertices they connect.
The closest thing I can come up with only dedups the combinations; it doesn't count them. Any help would be greatly appreciated.
g.E().project('edge','in','out').
by(label()).
by(inV().label()).
by(outV().label()).dedup()
Output:
[
{
"edge": "uses",
"in": "software-system",
"out": "person"
},
{
"edge": "runs on",
"in": "container",
"out": "software-system"
},
{
"edge": "requires",
"in": "component",
"out": "container"
},
{
"edge": "embeds",
"in": "code",
"out": "component"
}
]
Ideal output:
[
{
"edge": "uses",
"in": "software-system",
"out": "person",
"count": 105
},
{
"edge": "runs on",
"in": "container",
"out": "software-system",
"count": 22
},
{
"edge": "requires",
"in": "component",
"out": "container",
"count": 15
},
{
"edge": "embeds",
"in": "code",
"out": "component",
"count": 6
}
]
I think I would approach it with a combination of groupCount() and project(), shown here against TinkerPop's "modern" toy graph:
gremlin> g.E().groupCount().
......1> by(project('edge','in','out').
......2> by(label).
......3> by(inV().label()).
......4> by(outV().label())).
......5> unfold()
==>{edge=created, in=software, out=person}=4
==>{edge=knows, in=person, out=person}=2
If your graph database doesn't support maps as keys, you may need to transform the result further:
gremlin> g.E().groupCount().
......1> by(project('edge','in','out').
......2> by(label).
......3> by(inV().label()).
......4> by(outV().label())).
......5> unfold().
......6> map(union(select(keys), select(values)).fold())
==>[[edge:created,in:software,out:person],4]
==>[[edge:knows,in:person,out:person],2]
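To get closer to the flat shape asked for in the question, with the count as a fourth field, one option is to re-project each entry after unfold(). This is only a sketch: it assumes select(keys), select(values) and the nested select() on a map behave as they do in TinkerPop 3.x, and I have not verified it against Cosmos DB.

```groovy
// Count edges per (edge label, in-vertex label, out-vertex label) combination,
// then flatten each map entry into a single map with a 'count' field.
g.E().groupCount().
  by(project('edge','in','out').
       by(label).
       by(inV().label()).
       by(outV().label())).
  unfold().
  project('edge','in','out','count').
    by(select(keys).select('edge')).   // pull each field back out of the key map
    by(select(keys).select('in')).
    by(select(keys).select('out')).
    by(select(values))                 // the count for that combination
```

Each result should then be a single map of the form {edge=..., in=..., out=..., count=...}, so no map is used as a key and the rows can be serialized as plain JSON objects.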