mongodb scala apache-spark casbah

com/mongodb/casbah/Imports$ ClassNotFound running spark-submit and Mongo


I'm having an issue when I try to run a jar using spark-submit. This is my sbt file:

name := "Reading From Mongo Project"
version := "1.0"
scalaVersion := "2.10.4"
libraryDependencies += "org.mongodb" %% "casbah" % "2.5.0"

I'm using

sbt package

to create the jar file, and everything looks fine. Then I ran it like this:

spark-submit --class "ReadingFromMongo" --master local /home/bigdata1/ReadingFromMongoScala/target/scala-2.10/reading-from-mongo-project_2.10-1.0.jar

And got this error:

Error: application failed with exception
java.lang.NoClassDefFoundError: com/mongodb/casbah/Imports$
        at ReadingFromMongo$.main(ReadingFromMongo.scala:6)
        at ReadingFromMongo.main(ReadingFromMongo.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:577)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:174)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:197)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: com.mongodb.casbah.Imports$
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 11 more

My ReadingFromMongo class is this one:

import com.mongodb.casbah.Imports._

object ReadingFromMongo {
    def main(args: Array[String]) {
        val mongoClient = MongoClient("mongocluster", 27017)
        val db = mongoClient("Grupo12")
        val coll = db("test")
        println("\nTotal: " + coll.count() + "\n")
    }
}

I don't know why this is happening. This is the first time I have faced this kind of problem.

Hope someone can help me.

Thanks a lot.


Solution

  • sbt package builds a jar that contains only your own code, without its dependencies, so Spark cannot find the Casbah classes at runtime. You need to do one of two things: add Casbah and the other required dependencies to the classpath, or build a "fat jar" that bundles the dependency classes. The sbt-assembly plugin will help if you choose the second approach.
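
For the fat-jar approach, a minimal sketch of the sbt setup might look like the following. The plugin version and the output jar name are assumptions for illustration, not something taken from the question; pick a plugin version that matches your sbt release.

```scala
// ---- project/plugins.sbt ----
// Assumed version; check which sbt-assembly release fits your sbt version.
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.10")

// ---- build.sbt ----
// Same settings as in the question.
name := "Reading From Mongo Project"

version := "1.0"

scalaVersion := "2.10.4"

libraryDependencies += "org.mongodb" %% "casbah" % "2.5.0"

// Hypothetical jar name, set explicitly so the output path is predictable
// and does not contain the spaces from the project name.
assemblyJarName in assembly := "reading-from-mongo-assembly.jar"
```

With this in place, run sbt assembly instead of sbt package, and pass the resulting jar (expected under target/scala-2.10/) to spark-submit in place of the one built by sbt package. The assembled jar should then contain the Casbah classes alongside ReadingFromMongo.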