gremlin cassandra-3.0 janusgraph tinkerpop3 java-17

Do JanusGraph Cassandra Indexes Need To Be Reindexed?

Do JanusGraph indices need to be reindexed after loading in more data?

I know this is a newbie question, but I am a newbie on this subject.
From my understanding right now, OrientDB and Neo4j don't, and I think most SQL database servers maintain their indexes automatically by default. But I just want to make sure I'm testing JanusGraph correctly when I'm checking its performance.

I didn't find anything in the JanusGraph docs stating explicitly whether or not it maintains its indexes, nor did I see sample indexing output or logs. I especially don't know whether the behavior changes between "inmemory" and a Cassandra backend server. Some examples also use Oracle Berkeley DB, which leaves me more uncertain about backend-specific concerns.

import org.apache.tinkerpop.gremlin.structure.Vertex;
import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;
import org.janusgraph.core.JanusGraphVertex;
import org.janusgraph.core.PropertyKey;
import org.janusgraph.core.schema.JanusGraphManagement;

public class Main {
    public static void main(String[] args) {
        // Connect to Cassandra through the CQL storage backend.
        JanusGraph janusGraph = JanusGraphFactory.build().set("storage.backend", "cql").set("storage.hostname", "localhost:9042").open();
        // Get or create the "_id" property key and, if missing, build a composite index on it.
        JanusGraphManagement janusGraphManagement = janusGraph.openManagement();
        PropertyKey propertyKey = janusGraphManagement.getOrCreatePropertyKey("_id");
        if (!janusGraphManagement.containsGraphIndex("_id"))
            janusGraphManagement.buildIndex("_id", Vertex.class).addKey(propertyKey).buildCompositeIndex();
        janusGraphManagement.commit();
        // Add two vertices, committing after each one.
        // Note: they set "test", not "_id", so the "_id" index gets no entries for them.
        JanusGraphVertex janusGraphVertex = janusGraph.addVertex();
        janusGraphVertex.property("test","test");
        janusGraph.tx().commit();
        janusGraphVertex = janusGraph.addVertex();
        janusGraphVertex.property("test","test2");
        janusGraph.tx().commit();
        janusGraph.close();
    }
}

    <dependencies>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-slf4j2-impl</artifactId>
            <version>2.20.0</version>
        </dependency>
        <dependency>
            <groupId>org.janusgraph</groupId>
            <artifactId>janusgraph-cql</artifactId>
            <version>1.0.0-20230504-014643.988c094</version>
        </dependency>
    </dependencies>

Solution

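If you create the property keys and labels you want to index in the same management transaction in which you create the index, there is no need to reindex. The index is usable right away, and JanusGraph updates it as part of each write transaction, so loading more data later does not require a reindex either.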

If the index needs to pick up anything that was created before the index itself, you need to perform the reindex steps.
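As a rough illustration of those reindex steps, here is a minimal sketch in Java. It follows my reading of the index management section of the JanusGraph docs and is not something taken from your project. The class name `ReindexSketch`, the helper `reindex`, and the index name `byId` are made up for the example, and the `inmemory` backend is assumed to be on the classpath (it ships in the separate `janusgraph-inmemory` artifact). Swap in your CQL configuration to try it against Cassandra.

```java
import org.apache.tinkerpop.gremlin.structure.Vertex;
import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;
import org.janusgraph.core.schema.JanusGraphIndex;
import org.janusgraph.core.schema.JanusGraphManagement;
import org.janusgraph.core.schema.SchemaAction;
import org.janusgraph.core.schema.SchemaStatus;
import org.janusgraph.graphdb.database.management.ManagementSystem;

public class ReindexSketch {

    // Hypothetical helper: rebuilds an existing graph index over data that is already stored.
    static SchemaStatus reindex(JanusGraph graph, String indexName) throws Exception {
        // Close any transaction left open on this thread; open transactions
        // can keep a new index from leaving the INSTALLED state.
        graph.tx().rollback();

        // Wait until the index is known to the running JanusGraph instances.
        ManagementSystem.awaitGraphIndexStatus(graph, indexName)
                .status(SchemaStatus.REGISTERED, SchemaStatus.ENABLED)
                .call();

        // Run the reindex job and block until it has finished.
        JanusGraphManagement mgmt = graph.openManagement();
        mgmt.updateIndex(mgmt.getGraphIndex(indexName), SchemaAction.REINDEX).get();
        mgmt.commit();

        // Wait for the index to become ENABLED.
        ManagementSystem.awaitGraphIndexStatus(graph, indexName)
                .status(SchemaStatus.ENABLED)
                .call();

        // Read back the status of the first indexed key.
        mgmt = graph.openManagement();
        JanusGraphIndex index = mgmt.getGraphIndex(indexName);
        SchemaStatus status = index.getIndexStatus(index.getFieldKeys()[0]);
        mgmt.rollback();
        return status;
    }

    public static void main(String[] args) throws Exception {
        JanusGraph graph = JanusGraphFactory.open("inmemory");

        // Data written BEFORE the index exists; the "_id" key is created implicitly here.
        graph.addVertex().property("_id", "a1");
        graph.tx().commit();

        // The index is created later, on a key that already exists.
        JanusGraphManagement mgmt = graph.openManagement();
        mgmt.buildIndex("byId", Vertex.class)
                .addKey(mgmt.getPropertyKey("_id"))
                .buildCompositeIndex();
        mgmt.commit();

        System.out.println("Index status after reindex: " + reindex(graph, "byId"));
        graph.close();
    }
}
```

On a large Cassandra-backed graph the docs also describe running the reindex as a MapReduce job instead of in-process; the sketch above only covers the simple in-process route.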

Once that is done and the transaction is committed, you should be able to start adding data and writing queries that use the index(es) you created.

Ending a Gremlin query with .profile() will show you which (if any) index the query runtime used to help execute the query.
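The same check can be made from Java. The sketch below is only an illustration: `ProfileSketch`, `profileLookup`, and the index name `byId` are hypothetical names, and it again assumes the `inmemory` backend is available. It creates the key and the index together, adds one vertex, and prints the profile of a lookup on the indexed key.

```java
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.process.traversal.util.TraversalMetrics;
import org.apache.tinkerpop.gremlin.structure.Vertex;
import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;
import org.janusgraph.core.PropertyKey;
import org.janusgraph.core.schema.JanusGraphManagement;

public class ProfileSketch {

    // Hypothetical helper: profiles a vertex lookup by property key and value.
    static TraversalMetrics profileLookup(JanusGraph graph, String key, Object value) {
        GraphTraversalSource g = graph.traversal();
        // profile() replaces the normal results with per-step metrics.
        TraversalMetrics metrics = g.V().has(key, value).profile().next();
        // The lookup opened a read transaction; release it.
        graph.tx().rollback();
        return metrics;
    }

    public static void main(String[] args) {
        JanusGraph graph = JanusGraphFactory.open("inmemory");

        // Key and index are created in the same management transaction,
        // so no reindex is needed afterwards.
        JanusGraphManagement mgmt = graph.openManagement();
        PropertyKey id = mgmt.makePropertyKey("_id").dataType(String.class).make();
        mgmt.buildIndex("byId", Vertex.class).addKey(id).buildCompositeIndex();
        mgmt.commit();

        graph.addVertex().property("_id", "a1");
        graph.tx().commit();

        // Print the metrics table and inspect the JanusGraph step's annotations.
        System.out.println(profileLookup(graph, "_id", "a1"));
        graph.close();
    }
}
```

When the index is used, the annotations under the first step should name it; when it is not, JanusGraph falls back to scanning all vertices and normally logs a warning saying so.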

The easiest way to experiment with this is to use the inmemory graph and launch it from the Gremlin console using:

gremlin> g=JanusGraphFactory.open('inmemory').traversal()
==>graphtraversalsource[standardjanusgraph[inmemory:[127.0.0.1]], standard]