We have a PIO 0.11.0 instance running, and we are attempting to use the UR engine, version 0.6.0 (https://github.com/actionml/universal-recommender). We have loaded the eventserver up with our training data, and when we run pio train
the following error is produced:
[INFO] [Engine$] EngineWorkflow.train
[INFO] [Engine$] DataSource: com.actionml.DataSource@d36c1c3
[INFO] [Engine$] Preparator: com.actionml.Preparator@437281c5
[INFO] [Engine$] AlgorithmList: List(com.actionml.URAlgorithm@27da994b)
[INFO] [Engine$] Data sanity check is on.
[WARN] [TableInputFormatBase] Cannot resolve the host name for ACPIOTest/127.0.1.1 because of javax.naming.NameNotFoundException: DNS name not found [response code 3]; remaining name '1.1.0.127.in-addr.arpa'
[INFO] [DataSource] Received events List()
[WARN] [TableInputFormatBase] Cannot resolve the host name for ACPIOTest/127.0.1.1 because of javax.naming.NameNotFoundException: DNS name not found [response code 3]; remaining name '1.1.0.127.in-addr.arpa'
[INFO] [Engine$] com.actionml.TrainingData does not support data sanity check. Skipping check.
[INFO] [Engine$] com.actionml.PreparedData does not support data sanity check. Skipping check.
[INFO] [URAlgorithm] Actions read now creating correlators
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0
at org.apache.mahout.math.cf.SimilarityAnalysis$.cooccurrencesIDSs(SimilarityAnalysis.scala:145)
at com.actionml.URAlgorithm.calcAll(URAlgorithm.scala:311)
at com.actionml.URAlgorithm.train(URAlgorithm.scala:285)
at com.actionml.URAlgorithm.train(URAlgorithm.scala:175)
at org.apache.predictionio.controller.P2LAlgorithm.trainBase(P2LAlgorithm.scala:49)
at org.apache.predictionio.controller.Engine$$anonfun$18.apply(Engine.scala:692)
at org.apache.predictionio.controller.Engine$$anonfun$18.apply(Engine.scala:692)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.immutable.List.foreach(List.scala:318)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at org.apache.predictionio.controller.Engine$.train(Engine.scala:692)
at org.apache.predictionio.controller.Engine.train(Engine.scala:177)
at org.apache.predictionio.workflow.CoreWorkflow$.runTrain(CoreWorkflow.scala:67)
at org.apache.predictionio.workflow.CreateWorkflow$.main(CreateWorkflow.scala:250)
at org.apache.predictionio.workflow.CreateWorkflow.main(CreateWorkflow.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Our engine.json is as follows:
{
"comment":" This config file uses default settings for all but the required values see README.md for docs",
"id": "default",
"description": "Default settings",
"engineFactory": "com.actionml.RecommendationEngine",
"datasource": {
"params" : {
"name": "sample-handmade-data.txt",
"appName": "ACRec",
"eventNames": ["purchase", "view"]
}
},
"sparkConf": {
"spark.serializer": "org.apache.spark.serializer.KryoSerializer",
"spark.kryo.registrator": "org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator",
"spark.kryo.referenceTracking": "false",
"spark.kryoserializer.buffer.mb": "300",
"spark.kryoserializer.buffer": "300m",
"es.index.auto.create": "true"
},
"algorithms": [
{
"comment": "simplest setup where all values are default, popularity based backfill, must add eventsNames",
"name": "ur",
"params": {
"appName": "ACRec",
"indexName": "urindex",
"typeName": "items",
"comment": "must have data for the first event or the model will not build, other events are optional",
"eventNames": ["purchase", "view"]
}
}
]
}
Any help is appreciated.
PIO can not find any purchase events in your data (may be they have wrong format). As result it can't build correlation between users & items.
Can you show sample events?