What are the next steps to complete JanusGraph indices?
I've already tried following the JanusGraph docs line by line. I also tried committing on janusGraph directly instead of on janusGraphManagement, and I tried the suggestions provided so far; none of them have worked yet.
[o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[]]. For better performance, use indexes
I tried to recreate what I initially made, following part of the suggestion to revisit the JanusGraph docs.
The issue with this attempt is that the logs still say I don't have indexes.
This is what initially got me checking whether I needed janusGraph.commit() instead.
...
2023-05-15 09:58:05,984 [INFO] [o.j.d.l.k.KCVSLog.main] :: Loaded unidentified ReadMarker start time 2023-05-15T14:58:05.984686Z into org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller@48b4a043
2023-05-15 09:58:06,047 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[]]. For better performance, use indexes
2023-05-15 09:58:06,270 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[~label = entity]]. For better performance, use indexes
2023-05-15 09:58:06,275 [INFO] [Main.main] :: drop g.V().hasLabel("entity").count().next(): 0
2023-05-15 09:58:07,020 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[~label = entity]]. For better performance, use indexes
2023-05-15 09:58:07,033 [INFO] [Main.main] :: addVertex g.V().hasLabel("entity").count().next(): 1
2023-05-15 09:58:07,046 [INFO] [o.j.g.d.m.GraphIndexStatusWatcher.main] :: Some key(s) on index _id do not currently have status(es) [REGISTERED]: _id=ENABLED
...
[o.j.g.d.m.GraphIndexStatusWatcher.main] :: Some key(s) on index _id do not currently have status(es) [REGISTERED]: _id=ENABLED
2023-05-15 09:59:07,520 [INFO] [o.j.g.d.m.GraphIndexStatusWatcher.main] :: Timed out (PT1M) while waiting for index _id to converge on status(es) [REGISTERED]
2023-05-15 09:59:07,522 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[~label = entity]]. For better performance, use indexes
2023-05-15 09:59:07,531 [INFO] [Main.main] :: awaitGraphIndexStatus g.V().hasLabel("entity").count().next(): 1
2023-05-15 09:59:07,627 [INFO] [o.j.g.o.j.IndexRepairJob.Thread-68] :: Index _id metrics: success-tx: 2 doc-updates: 0 succeeded: 1
...
2023-05-15 09:59:08,297 [INFO] [o.j.g.d.m.ManagementSystem.Thread-51] :: Index update job successful for [_id]
2023-05-15 09:59:08,299 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[~label = entity]]. For better performance, use indexes
2023-05-15 09:59:08,304 [INFO] [Main.main] :: updateIndex g.V().hasLabel("entity").count().next(): 1
Process finished with exit code 0
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.structure.Vertex;
import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;
import org.janusgraph.core.JanusGraphVertex;
import org.janusgraph.core.PropertyKey;
import org.janusgraph.core.schema.JanusGraphIndex;
import org.janusgraph.core.schema.JanusGraphManagement;
import org.janusgraph.core.schema.SchemaAction;
import org.janusgraph.graphdb.database.management.GraphIndexStatusReport;
import org.janusgraph.graphdb.database.management.ManagementSystem;
import java.util.concurrent.ExecutionException;
public class Test {
    private static final Logger logger = LogManager.getLogger(Main.class);
    public static void main(String[] args) throws InterruptedException, ExecutionException {
        JanusGraph janusGraph = JanusGraphFactory.build().set("storage.backend", "cql").set("storage.hostname", "localhost:9042").open();
        GraphTraversalSource g = janusGraph.traversal();
        g.V().drop().iterate();
        janusGraph.tx().commit();
        janusGraph.tx().open();
        logger.info("drop g.V().hasLabel(\"entity\").count().next():\t" + g.V().hasLabel("entity").count().next());
        JanusGraphVertex vertex = janusGraph.addVertex("entity");
        vertex.property("_id", "Test1");
        janusGraph.tx().commit();
        janusGraph.tx().open();
        logger.info("addVertex g.V().hasLabel(\"entity\").count().next():\t" + g.V().hasLabel("entity").count().next());
        JanusGraphManagement janusGraphManagement = janusGraph.openManagement();
        PropertyKey propertyKey = janusGraphManagement.getOrCreatePropertyKey("_id");
        if (!janusGraphManagement.containsGraphIndex("_id"))
            janusGraphManagement.buildIndex("_id", Vertex.class).addKey(propertyKey).buildCompositeIndex();
        janusGraphManagement.commit();
        janusGraphManagement = janusGraph.openManagement();
        GraphIndexStatusReport report = ManagementSystem.awaitGraphIndexStatus(janusGraph, "_id").call();
        logger.info("awaitGraphIndexStatus g.V().hasLabel(\"entity\").count().next():\t" + g.V().hasLabel("entity").count().next());
        JanusGraphIndex test = janusGraphManagement.getGraphIndex("_id");
        janusGraphManagement.updateIndex(test, SchemaAction.REINDEX).get();
        janusGraphManagement.commit();
        logger.info("updateIndex g.V().hasLabel(\"entity\").count().next():\t" + g.V().hasLabel("entity").count().next());
        janusGraph.close();
    }
}
JanusGraph is telling me I cannot enable the index because its status is INSTALLED, when I expected it to be REGISTERED by that point.
That shouldn't even matter if I'm just doing a basic awaitGraphIndexStatus() like the example.
And none of it would matter if I knew how to create the index before the vertices existed.
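To make the ordering problem concrete, here is a minimal plain-Java sketch of the status checks implied by the exception in this attempt. It is not JanusGraph code: the class `IndexLifecycleSketch`, its enums, and the `apply` method are hypothetical stand-ins, and the allowed transitions are a simplified assumption inferred from the error message and the logs (the real lifecycle also involves committing the management transaction and waiting for the status to propagate).

```java
import java.util.EnumSet;
import java.util.Set;

public class IndexLifecycleSketch {

    // Index statuses that show up in the logs and stack trace of this question.
    enum Status { INSTALLED, REGISTERED, ENABLED }

    // Hypothetical stand-in for SchemaAction: each action lists the statuses
    // it may start from and the status it leads to (assumed, simplified).
    enum Action {
        REGISTER_INDEX(EnumSet.of(Status.INSTALLED), Status.REGISTERED),
        ENABLE_INDEX(EnumSet.of(Status.REGISTERED), Status.ENABLED),
        REINDEX(EnumSet.of(Status.REGISTERED, Status.ENABLED), Status.ENABLED);

        final Set<Status> applicableFrom;
        final Status leadsTo;

        Action(Set<Status> applicableFrom, Status leadsTo) {
            this.applicableFrom = applicableFrom;
            this.leadsTo = leadsTo;
        }
    }

    // Applies an action, rejecting it when the current status does not allow it.
    static Status apply(Status current, Action action) {
        if (!action.applicableFrom.contains(current)) {
            throw new IllegalArgumentException("Update action [" + action
                    + "] cannot be invoked for index with status [" + current + "]");
        }
        return action.leadsTo;
    }

    public static void main(String[] args) {
        // Registering first and enabling second is accepted.
        Status status = Status.INSTALLED;
        status = apply(status, Action.REGISTER_INDEX);
        status = apply(status, Action.ENABLE_INDEX);
        System.out.println("after register + enable: " + status);

        // Enabling an index that is still INSTALLED is rejected.
        try {
            apply(Status.INSTALLED, Action.ENABLE_INDEX);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this toy model the rejected call produces the same message as the stack trace in this attempt, which is consistent with the index never having left INSTALLED when ENABLE_INDEX was requested.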
2023-05-12 12:50:04,641 [INFO] [c.d.o.d.i.c.ContactPoints.main] :: Contact point localhost:9042 resolves to multiple addresses, will use them all ([localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1])
2023-05-12 12:50:04,737 [INFO] [c.d.o.d.i.c.DefaultMavenCoordinates.main] :: DataStax Java driver for Apache Cassandra(R) (com.datastax.oss:java-driver-core) version 4.15.0
2023-05-12 12:50:05,236 [INFO] [c.d.o.d.i.c.t.Clock.JanusGraph Session-admin-0] :: Using native clock for microsecond precision
2023-05-12 12:50:05,518 [WARN] [c.d.o.d.i.c.l.h.OptionalLocalDcHelper.JanusGraph Session-admin-0] :: [JanusGraph Session|default] You specified datacenter1 as the local DC, but some contact points are from a different DC: Node(endPoint=localhost/127.0.0.1:9042, hostId=null, hashCode=4fe054a)=null; please provide the correct local DC, or check your contact points
2023-05-12 12:50:05,763 [INFO] [o.j.g.i.UniqueInstanceIdRetriever.main] :: Generated unique-instance-id=c0a8563c21084-rmt-lap-win201
2023-05-12 12:50:05,786 [INFO] [c.d.o.d.i.c.ContactPoints.main] :: Contact point localhost:9042 resolves to multiple addresses, will use them all ([localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1])
2023-05-12 12:50:05,823 [INFO] [c.d.o.d.i.c.t.Clock.JanusGraph Session-admin-0] :: Using native clock for microsecond precision
2023-05-12 12:50:05,861 [WARN] [c.d.o.d.i.c.l.h.OptionalLocalDcHelper.JanusGraph Session-admin-0] :: [JanusGraph Session|default] You specified datacenter1 as the local DC, but some contact points are from a different DC: Node(endPoint=localhost/127.0.0.1:9042, hostId=null, hashCode=5ac9b054)=null; please provide the correct local DC, or check your contact points
2023-05-12 12:50:05,880 [INFO] [o.j.d.c.ExecutorServiceBuilder.main] :: Initiated fixed thread pool of size 40
2023-05-12 12:50:05,998 [INFO] [o.j.g.d.StandardJanusGraph.main] :: Gremlin script evaluation is disabled
2023-05-12 12:50:06,026 [INFO] [o.j.d.l.k.KCVSLog.main] :: Loaded unidentified ReadMarker start time 2023-05-12T17:50:06.025476Z into org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller@48b4a043
2023-05-12 12:50:06,086 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[]]. For better performance, use indexes
2023-05-12 12:50:06,344 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[]]. For better performance, use indexes
2023-05-12 12:50:06,353 [INFO] [Main.main] :: drop g.V().count().next(): 0
2023-05-12 12:50:07,107 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[]]. For better performance, use indexes
2023-05-12 12:50:07,110 [INFO] [Main.main] :: addVertex g.V().count().next(): 1
2023-05-12 12:50:08,197 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[]]. For better performance, use indexes
2023-05-12 12:50:08,202 [INFO] [Main.main] :: buildIndex g.V().count().next(): 1
2023-05-12 12:50:08,209 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[]]. For better performance, use indexes
2023-05-12 12:50:08,212 [INFO] [Main.main] :: updateIndex g.V().count().next(): 1
Exception in thread "main" java.lang.IllegalArgumentException: Update action [ENABLE_INDEX] cannot be invoked for index with status [INSTALLED]
at org.janusgraph.core.schema.SchemaAction.isApplicableStatus(SchemaAction.java:85)
at org.janusgraph.graphdb.database.management.ManagementSystem.updateIndex(ManagementSystem.java:864)
at org.janusgraph.graphdb.database.management.ManagementSystem.updateIndex(ManagementSystem.java:845)
at Test12.main(Test12.java:39)
public class Test {
    private static final Logger logger = LogManager.getLogger(Main.class);
    public static void main(String[] args) throws InterruptedException, ExecutionException {
        JanusGraph janusGraph = JanusGraphFactory.build().set("storage.backend", "cql").set("storage.hostname", "localhost:9042").open();
        GraphTraversalSource g = janusGraph.traversal();
        g.V().drop().iterate();
        janusGraph.tx().commit();
        janusGraph.tx().open();
        logger.info("drop g.V().hasLabel(\"entity\").count().next():\t" + g.V().hasLabel("entity").count().next());
        JanusGraphVertex vertex = janusGraph.addVertex("entity");
        vertex.property("_id", "Test1");
        janusGraph.tx().commit();
        janusGraph.tx().open();
        logger.info("addVertex g.V().hasLabel(\"entity\").count().next():\t" + g.V().hasLabel("entity").count().next());
        JanusGraphManagement janusGraphManagement = janusGraph.openManagement();
        PropertyKey propertyKey = janusGraphManagement.getOrCreatePropertyKey("_id");
        janusGraphManagement.buildIndex("_id", Vertex.class).addKey(propertyKey).buildCompositeIndex();
        janusGraph.tx().commit();
        janusGraph.tx().open();
        logger.info("buildIndex g.V().hasLabel(\"entity\").count().next():\t" + g.V().hasLabel("entity").count().next());
        janusGraphManagement.updateIndex(janusGraphManagement.getGraphIndex("_id"), SchemaAction.REGISTER_INDEX).get();
        janusGraph.tx().commit();
        janusGraph.tx().open();
        logger.info("REGISTER_INDEX g.V().hasLabel(\"entity\").count().next():\t" + g.V().hasLabel("entity").count().next());
        janusGraphManagement.updateIndex(janusGraphManagement.getGraphIndex("_id"), SchemaAction.ENABLE_INDEX).get();
        janusGraph.tx().commit();
        janusGraph.tx().open();
        logger.info("ENABLE_INDEX g.V().hasLabel(\"entity\").count().next():\t" + g.V().hasLabel("entity").count().next());
        ManagementSystem.awaitGraphIndexStatus(janusGraph, "_id").call();
        janusGraph.tx().commit();
        janusGraph.tx().open();
        logger.info("awaitGraphIndexStatus g.V().hasLabel(\"entity\").count().next():\t" + g.V().hasLabel("entity").count().next());
        janusGraph.close();
    }
}
After the suggestion to revisit the JanusGraph docs, I no longer think the output was a misnomer.
I cleaned the code up to what it is now.
2023-05-15 11:02:15,142 [INFO] [c.d.o.d.i.c.ContactPoints.main] :: Contact point localhost:9042 resolves to multiple addresses, will use them all ([localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1])
2023-05-15 11:02:15,230 [INFO] [c.d.o.d.i.c.DefaultMavenCoordinates.main] :: DataStax Java driver for Apache Cassandra(R) (com.datastax.oss:java-driver-core) version 4.15.0
2023-05-15 11:02:15,853 [INFO] [c.d.o.d.i.c.t.Clock.JanusGraph Session-admin-0] :: Using native clock for microsecond precision
2023-05-15 11:02:16,152 [WARN] [c.d.o.d.i.c.l.h.OptionalLocalDcHelper.JanusGraph Session-admin-0] :: [JanusGraph Session|default] You specified datacenter1 as the local DC, but some contact points are from a different DC: Node(endPoint=localhost/127.0.0.1:9042, hostId=null, hashCode=705b0e14)=null; please provide the correct local DC, or check your contact points
2023-05-15 11:02:16,410 [INFO] [o.j.g.i.UniqueInstanceIdRetriever.main] :: Generated unique-instance-id=c0a856493416-rmt-lap-win201
2023-05-15 11:02:16,433 [INFO] [c.d.o.d.i.c.ContactPoints.main] :: Contact point localhost:9042 resolves to multiple addresses, will use them all ([localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1])
2023-05-15 11:02:16,472 [INFO] [c.d.o.d.i.c.t.Clock.JanusGraph Session-admin-0] :: Using native clock for microsecond precision
2023-05-15 11:02:16,522 [WARN] [c.d.o.d.i.c.l.h.OptionalLocalDcHelper.JanusGraph Session-admin-0] :: [JanusGraph Session|default] You specified datacenter1 as the local DC, but some contact points are from a different DC: Node(endPoint=localhost/127.0.0.1:9042, hostId=null, hashCode=1b473c95)=null; please provide the correct local DC, or check your contact points
2023-05-15 11:02:16,548 [INFO] [o.j.d.c.ExecutorServiceBuilder.main] :: Initiated fixed thread pool of size 40
2023-05-15 11:02:16,678 [INFO] [o.j.g.d.StandardJanusGraph.main] :: Gremlin script evaluation is disabled
2023-05-15 11:02:16,704 [INFO] [o.j.d.l.k.KCVSLog.main] :: Loaded unidentified ReadMarker start time 2023-05-15T16:02:16.704454Z into org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller@74ea46e2
2023-05-15 11:02:16,762 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[]]. For better performance, use indexes
2023-05-15 11:02:16,989 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[~label = entity]]. For better performance, use indexes
2023-05-15 11:02:16,992 [INFO] [Main.main] :: drop g.V().hasLabel("entity").count().next(): 0
2023-05-15 11:02:17,010 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[~label = entity]]. For better performance, use indexes
2023-05-15 11:02:17,014 [INFO] [Main.main] :: makePropertyKey g.V().hasLabel("entity").count().next(): 0
2023-05-15 11:02:17,748 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[~label = entity]]. For better performance, use indexes
2023-05-15 11:02:17,759 [INFO] [Main.main] :: addVertex g.V().hasLabel("entity").count().next(): 1
2023-05-15 11:02:17,761 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[~label = entity]]. For better performance, use indexes
2023-05-15 11:02:17,764 [INFO] [Main.main] :: updateIndex g.V().hasLabel("entity").count().next(): 1
Process finished with exit code 0
public class Test {
    private static final Logger logger = LogManager.getLogger(Main.class);
    public static void main(String[] args) throws InterruptedException, ExecutionException {
        JanusGraph janusGraph = JanusGraphFactory.build().set("storage.backend", "cql").set("storage.hostname", "localhost:9042").open();
        GraphTraversalSource g = janusGraph.traversal();
        g.V().drop().iterate();
        janusGraph.tx().commit();
        logger.info("drop g.V().hasLabel(\"entity\").count().next():\t" + g.V().hasLabel("entity").count().next());
        JanusGraphManagement janusGraphManagement = janusGraph.openManagement();
        if (!janusGraphManagement.containsPropertyKey("_id"))
            janusGraphManagement.makePropertyKey("_id").dataType(String.class).make();
        janusGraphManagement.commit();
        logger.info("makePropertyKey g.V().hasLabel(\"entity\").count().next():\t" + g.V().hasLabel("entity").count().next());
        JanusGraphVertex vertex = janusGraph.addVertex("entity");
        vertex.property("_id", "Test1");
        janusGraph.tx().commit();
        janusGraph.tx().open();
        logger.info("addVertex g.V().hasLabel(\"entity\").count().next():\t" + g.V().hasLabel("entity").count().next());
        janusGraph.close();
    }
}
The final issue was understanding that both the propertyKey and its value have to be supplied in the query for the index to be used.
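As a rough illustration of why the value matters, here is a toy plain-Java model of an exact-match index. It is not how JanusGraph is implemented: `CompositeIndexSketch`, its methods, and the `fullScans` counter are hypothetical names invented for this sketch. It only mirrors what the final log in this post shows, where the label-only and key-only queries still warn about iterating over all vertices while the query that supplies "Test1" does not.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CompositeIndexSketch {

    // Minimal vertex: a label plus an optional "_id" value (null when absent).
    static final class V {
        final String label;
        final String id;

        V(String label, String id) {
            this.label = label;
            this.id = id;
        }
    }

    private final List<V> vertices = new ArrayList<>();
    // Toy exact-match index: "_id" value -> vertices carrying that value.
    private final Map<String, List<V>> byId = new HashMap<>();
    // Counts how often a query had to walk every vertex.
    int fullScans = 0;

    void addVertex(String label, String id) {
        V v = new V(label, id);
        vertices.add(v);
        if (id != null) {
            byId.computeIfAbsent(id, k -> new ArrayList<>()).add(v);
        }
    }

    // Like has("_id", value): key and value are both known, so the map answers directly.
    long countWithId(String value) {
        return byId.getOrDefault(value, Collections.emptyList()).size();
    }

    // Like has("_id"): there is no value to look up, so every vertex is checked.
    long countHavingId() {
        fullScans++;
        return vertices.stream().filter(v -> v.id != null).count();
    }

    // Like hasLabel(label): the label is not part of this index, so every vertex is checked.
    long countWithLabel(String label) {
        fullScans++;
        return vertices.stream().filter(v -> v.label.equals(label)).count();
    }

    public static void main(String[] args) {
        CompositeIndexSketch graph = new CompositeIndexSketch();
        graph.addVertex("entity", "Test1");

        System.out.println("by label:  " + graph.countWithLabel("entity") + ", full scans: " + graph.fullScans);
        System.out.println("has key:   " + graph.countHavingId() + ", full scans: " + graph.fullScans);
        System.out.println("key+value: " + graph.countWithId("Test1") + ", full scans: " + graph.fullScans);
    }
}
```

Only the last call leaves the scan counter untouched, which is the behaviour I was missing: the lookup needs a concrete value to go through the index.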
Thanks again, Stephen Mallette!
He pointed me to a great resource from Jason Plurad, which helps if you extrapolate from what you read.
2023-05-15 15:33:56,415 [INFO] [o.j.d.c.b.ReadConfigurationBuilder.main] :: Set default timestamp provider MICRO
2023-05-15 15:33:56,428 [INFO] [o.j.g.i.UniqueInstanceIdRetriever.main] :: Generated unique-instance-id=c0a8564917832-rmt-lap-win201
2023-05-15 15:33:56,440 [INFO] [o.j.d.c.ExecutorServiceBuilder.main] :: Initiated fixed thread pool of size 40
2023-05-15 15:33:56,488 [INFO] [o.j.g.d.StandardJanusGraph.main] :: Gremlin script evaluation is disabled
2023-05-15 15:33:56,494 [INFO] [o.j.d.l.k.KCVSLog.main] :: Loaded unidentified ReadMarker start time 2023-05-15T20:33:56.494323Z into org.janusgraph.diskstorage.log.kcvs.KCVSLog$MessagePuller@54e7391d
2023-05-15 15:33:56,576 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[]]. For better performance, use indexes
2023-05-15 15:33:56,599 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[~label = entity]]. For better performance, use indexes
2023-05-15 15:33:56,601 [INFO] [Main.main] :: drop g.V().hasLabel("entity").count().next(): 0
2023-05-15 15:33:56,805 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[~label = entity]]. For better performance, use indexes
2023-05-15 15:33:56,806 [INFO] [Main.main] :: REGISTER_INDEX g.V().hasLabel("entity").count().next(): 0
2023-05-15 15:33:56,961 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[~label = entity]]. For better performance, use indexes
2023-05-15 15:33:56,971 [INFO] [Main.main] :: addVertex g.V().hasLabel("entity").count().next(): 1
2023-05-15 15:33:56,972 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[~label = entity]]. For better performance, use indexes
2023-05-15 15:33:56,973 [INFO] [Main.main] :: g.V().hasLabel("entity").count().next(): 1
2023-05-15 15:33:56,991 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[_id <> null]]. For better performance, use indexes
2023-05-15 15:33:56,998 [INFO] [Main.main] :: g.V().hasLabel("entity").count().next(): 1
2023-05-15 15:33:56,998 [WARN] [o.j.g.t.StandardJanusGraphTx.main] :: Query requires iterating over all vertices [[~label = entity, _id <> null]]. For better performance, use indexes
2023-05-15 15:33:56,999 [INFO] [Main.main] :: g.V().hasLabel("entity").count().next(): 1
2023-05-15 15:33:57,000 [INFO] [Main.main] :: g.V().hasLabel("entity").count().next(): 1
Process finished with exit code 0
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.structure.Vertex;
import org.janusgraph.core.JanusGraph;
import org.janusgraph.core.JanusGraphFactory;
import org.janusgraph.core.JanusGraphVertex;
import org.janusgraph.core.PropertyKey;
import org.janusgraph.core.schema.JanusGraphManagement;
import java.util.concurrent.ExecutionException;
public class Test16 {
    private static final Logger logger = LogManager.getLogger(Main.class);
    public static void main(String[] args) throws InterruptedException, ExecutionException {
        //JanusGraph janusGraph = JanusGraphFactory.build().set("storage.backend", "inmemory").open();
        JanusGraph janusGraph = JanusGraphFactory.build().set("storage.backend", "cql").set("storage.hostname", "localhost:9042").open();
        GraphTraversalSource g = janusGraph.traversal();
        g.V().drop().iterate();
        janusGraph.tx().commit();
        logger.info("drop g.V().hasLabel(\"entity\").count().next():\t" + g.V().hasLabel("entity").count().next());
        JanusGraphManagement janusGraphManagement = janusGraph.openManagement();
        PropertyKey propertyKey = janusGraphManagement.getOrCreatePropertyKey("_id");
        if (!janusGraphManagement.containsGraphIndex("_id"))
            janusGraphManagement.buildIndex("_id", Vertex.class).addKey(propertyKey).buildCompositeIndex();
        janusGraphManagement.commit();
        logger.info("REGISTER_INDEX g.V().hasLabel(\"entity\").count().next():\t" + g.V().hasLabel("entity").count().next());
        JanusGraphVertex vertex = janusGraph.addVertex("entity");
        vertex.property("_id", "Test1");
        janusGraph.tx().commit();
        logger.info("addVertex g.V().hasLabel(\"entity\").count().next():\t" + g.V().hasLabel("entity").count().next());
        logger.info("g.V().hasLabel(\"entity\").count().next():\t" + g.V().hasLabel("entity").count().next());
        logger.info("g.V().has(\"_id\").count().next();:\t" + g.V().has("_id").count().next());
        logger.info("g.V().hasLabel(\"entity\").has(\"_id\").count().next():\t" + g.V().hasLabel("entity").has("_id").count().next());
        logger.info("g.V().hasLabel(\"entity\").has(\"_id\", \"Test1\").count().next():\t" + g.V().hasLabel("entity").has("_id", "Test1").count().next());
        janusGraph.close();
    }
}
I switched back to the inmemory backend, as Kelvin Lawrence suggested earlier, since I don't need much memory to get this concept down.