gremlin gremlin-server azure-cosmosdb-gremlinapi gremlinpython

Convert Gremlin declarative match query to its imperative counterpart


I have a graph consisting of 4 types of nodes and 4 types of edges as shown below.

(c)-[affecting]->(p)
(c)-[found_in_release]->(r)
(p)-[found_in_release]->(r)
(s)-[severity]->(c)

I initially wrote my queries declaratively using match() and tested them in the Gremlin console. To my surprise, I found that the Cosmos DB Gremlin API does not support the match() step, so I now have to convert the declarative match() query below to its imperative counterpart:

    g.V().match(
            __.as('c').out('affecting').as('p'), \
            __.as('c').out('cve_found_in_release').as('r'), \
            __.as('p').out('pack_found_in_release').as('r'), \
            __.as('s').both('severity').as('c') \
    ). \
    select('c', 'p', 'r', 's').limit(10)

What I figured I could do is transform the match() step into 2 traversals as shown below:

    # (c)-[affecting]->(p)-[pack_found_in_release]->(r)
    g.V().hasLabel('cve').as('c').out('affecting').as('p').out('pack_found_in_release').as('r').select('c', 'p', 'r')

    # (s)-[severity]->(c)
    g.V().hasLabel('cve').as('c').in('severity').as('s').select('s', 'c')

And then merge the results from these two queries.

However, I am wondering whether there is a better way to write this pattern-matching query that the Cosmos DB Gremlin API supports, that is, without the match() step. Any insight would be appreciated.


Solution

  • If I understand your use case correctly, you are searching for subgraphs in which a CVE is linked to a Severity, a Package, and a Release. Gremlin is a fairly rich language, so there are probably several ways to do this. The first one that worked for me is given below, using JanusGraph's Graph of the Gods.

    graph = JanusGraphFactory.open('conf/janusgraph-inmemory.properties')
    GraphOfTheGodsFactory.loadWithoutMixedIndex(graph,true)
    g = graph.traversal()
    g.V().as('v').and(
        __.out('lives'),
        __.out('pet'),
        __.out('brother')
    ).project('v', 'l', 'p', 'b').
        by(values('name')).
        by(out('lives').values('name')).
        by(out('pet').values('name')).
        by(out('brother').values('name'))
    
    12:15:11 WARN  org.janusgraph.graphdb.transaction.StandardJanusGraphTx  - Query requires iterating over all vertices [()]. For better performance, use indexes
    ==>[v:pluto,l:tartarus,p:cerberus,b:jupiter]
    

    The and() step does the subgraph selection, which is needed if your graph contains CVEs that are not part of a complete subgraph. The project() step gathers the required information for each CVE subgraph.

    Using these patterns, you can add the additional condition that the CVE and Package are related to the same Release, if needed.
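
    As a rough sketch of that last point (not tested against Cosmos DB), the original match() pattern could be written as one chained traversal. It assumes the 'cve' vertex label and the edge labels used in the question, and it follows the severity edge with in(), as in the (s)-[severity]->(c) diagram; use both('severity') instead to mirror the original match() query exactly. It uses as()/select() rather than project(), because a by() modulator only takes the first result of its traversal, whereas select() returns every matching combination.

    ```groovy
    // Sketch only: labels are taken from the question, not from a verified schema.
    // 1. Walk c -> p -> r along 'affecting' and 'pack_found_in_release'.
    // 2. where(...) keeps r only if the same CVE 'c' also reaches it
    //    through 'cve_found_in_release'.
    // 3. select('c') steps back to the CVE to pick up its severity vertex.
    g.V().hasLabel('cve').as('c').
      out('affecting').as('p').
      out('pack_found_in_release').as('r').
      where(__.in('cve_found_in_release').where(eq('c'))).
      select('c').in('severity').as('s').
      select('c', 'p', 'r', 's').
      limit(10)
    ```

    Each result is a map with the keys c, p, r and s, which is the same shape the match() query produced, so no client-side merge of two result sets is needed.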