Running a Spark SQL (v2.1.0, Scala 2.11) program in Java fails with the following exception as soon as the first action is called on a DataFrame:
java.lang.ClassNotFoundException: org.codehaus.commons.compiler.UncheckedCompileException
I ran it in Eclipse, outside of the spark-submit environment. I use the following Spark SQL Maven dependency:
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.11</artifactId>
    <version>2.1.0</version>
    <scope>provided</scope>
</dependency>
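The culprit is the commons-compiler library: the version that ends up on the classpath conflicts with the one Spark SQL 2.1.0 needs, so the class named in the exception cannot be found.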
To work around this, add the following to your pom.xml:
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.codehaus.janino</groupId>
            <artifactId>commons-compiler</artifactId>
            <version>2.7.8</version>
        </dependency>
    </dependencies>
</dependencyManagement>
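To confirm which jar actually supplies the missing class when running from Eclipse, a small classpath check along these lines may help. This is only a sketch: the class name JaninoClasspathCheck and the helper locate are made up for illustration, and org.codehaus.janino.SimpleCompiler is included simply as a well-known Janino class to compare against.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of Spark or Janino.
class JaninoClasspathCheck {

    // Returns the jar or directory a class is loaded from, or "NOT FOUND" if it cannot be loaded.
    static String locate(String className) {
        try {
            // 'false' avoids running static initializers; we only want the class location.
            Class<?> cls = Class.forName(className, false,
                    JaninoClasspathCheck.class.getClassLoader());
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            return source == null
                    ? "(bootstrap or unknown location)"
                    : source.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "NOT FOUND";
        }
    }

    public static void main(String[] args) {
        String[] names = {
            // The class from the exception above.
            "org.codehaus.commons.compiler.UncheckedCompileException",
            // A Janino class, to see which janino jar is on the classpath.
            "org.codehaus.janino.SimpleCompiler"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

Run it with the same classpath as the failing program. If the first class is reported as NOT FOUND, or the two classes come from jars with different version numbers, the version conflict is still present.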