I am trying to use IntelliJ to build Spark applications written in Scala. I get the following error when I execute the Scala program:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/SparkConf
at Main$.main(Main.scala:10)
at Main.main(Main.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.SparkConf
at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
... 2 more
The line throwing the error is the following:
val conf = new SparkConf().setAppName("Sample Spark Scala Application")
I do not get any error if I just import org.apache.spark.SparkConf without executing the line above.
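For reference, Main.scala looks roughly like this. Only the object name and the SparkConf line are known from the stack trace; the rest is a minimal sketch of the surrounding code:

import org.apache.spark.SparkConf

object Main {
  def main(args: Array[String]): Unit = {
    // Main.scala:10 -- the line from the stack trace above
    val conf = new SparkConf().setAppName("Sample Spark Scala Application")
    // ... rest of the application
  }
}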
The following are the contents of my sbt file:
ThisBuild / version := "0.1.0-SNAPSHOT"
ThisBuild / scalaVersion := "2.12.15"
val sparkVersion = "3.2.4"
// Note the dependencies are provided
libraryDependencies += "org.apache.spark" %% "spark-core" % sparkVersion % "provided"
libraryDependencies += "org.apache.spark" %% "spark-sql" % sparkVersion % "provided"
libraryDependencies += "org.apache.spark" %% "spark-mllib" % sparkVersion % "provided"
lazy val root = (project in file("."))
.settings(
name := "untitled4"
)
The versions in the sbt file match what I see when I open the Spark shell.
The following are the contents of my PATH, JAVA_HOME, and SPARK_HOME variables:
ray@Rayquaza-ASUS-TUF-Gaming-F15-FX506HEB-FX566HEB:~$ echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/snap/bin:/usr/lib/jvm/java-8-openjdk-amd64/bin:/usr/spark/spark-3.2.4-bin-hadoop2.7/bin
ray@Rayquaza-ASUS-TUF-Gaming-F15-FX506HEB-FX566HEB:~$ echo $JAVA_HOME
/usr/lib/jvm/java-8-openjdk-amd64
ray@Rayquaza-ASUS-TUF-Gaming-F15-FX506HEB-FX566HEB:~$ echo $SPARK_HOME
/usr/spark/spark-3.2.4-bin-hadoop2.7
I am able to run code properly in the Spark shell, so Spark and Scala seem to be set up correctly, but IntelliJ isn't able to use them.
I tried following a few online guides, but they weren't helpful. I would appreciate help solving this problem. Let me know if any more information or details are required.
If you want to run the code from inside IntelliJ, the Spark dependencies must not be "provided". With the default run configuration, IntelliJ runs your application as a plain JVM application, which must be given all of its dependencies on the classpath. Something like the following happens by default when you press the Run button in a JVM project:
java -classpath /path/to/dependency:/path/to/dependency_2 YourMain
So my guess is that when you run the code from IntelliJ, your Spark dependencies are nowhere to be found by java on the provided classpath, because they are marked as provided.
You should change the scope of the Spark packages to compile in your sbt file, rebuild the project, and try again.
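For example, here is a minimal sketch of the updated build.sbt, keeping the same versions and project name as in your question, with the "provided" qualifier dropped so the dependencies default to compile scope:

ThisBuild / version := "0.1.0-SNAPSHOT"
ThisBuild / scalaVersion := "2.12.15"

val sparkVersion = "3.2.4"

// No "provided" qualifier: compile scope (the default) puts the Spark jars
// on the runtime classpath, so the plain java launch shown above can find them.
libraryDependencies += "org.apache.spark" %% "spark-core" % sparkVersion
libraryDependencies += "org.apache.spark" %% "spark-sql" % sparkVersion
libraryDependencies += "org.apache.spark" %% "spark-mllib" % sparkVersion

lazy val root = (project in file("."))
  .settings(
    name := "untitled4"
  )

After editing, reload the sbt project in IntelliJ so the change takes effect. Note that provided is still the right scope when you package the application for spark-submit, since in that case the cluster supplies the Spark jars at runtime.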