I have a project with multiple scala spark programs, while I run mvn install through eclipse I am able to get the correct jar generated which is used with spark-submit command to run.
After pushing the code to GIT we are trying to build it using jenkins, as we want to automatically push the jar file to our hadoop cluster using ansible. We have jenkinsfile with build goals as "compile package install -X".
The logs show that-
[DEBUG](f)artifact = com.esi.rxhome:PROJECT1:jar:0.0.1-SNAPSHOT
[DEBUG](f) attachedArtifacts = [com.esi.rxhome:PROJECT1:jar:jar- with-dependencies:0.0.1-SNAPSHOT, com.esi.rxhome:PROJECT1:jar:jar-with-dependencies:0.0.1-SNAPSHOT]
[DEBUG] (f) createChecksum = false
[DEBUG] (f) localRepository = id: local
url: file:///home/jenkins/.m2/repository/
layout: default
snapshots: [enabled => true, update => always]
releases: [enabled => true, update => always]
[DEBUG] (f) packaging = jar
[DEBUG] (f) pomFile = /opt/jenkins-data/workspace/ng_datahub-pipeline_develop-JYTJLDEXV65VZWDCZAXG5Y7SHBG2534GFEF3OF2WC4543G6ANZYA/pom.xml
[DEBUG] (s) skip = false
[DEBUG] (f) updateReleaseInfo = false
[DEBUG] -- end configuration --
[INFO] **No primary artifact to install, installing attached artifacts instead**
I saw the error in the similar post -
Maven: No primary artifact to install, installing attached artifacts instead
But here the answer says - Remove auto clean, I am not sure how to stop that while jenkins is building the jar file.
Below is the pom.xml-
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.esi.rxhome</groupId>
<artifactId>PROJECT1</artifactId>
<version>0.0.1-SNAPSHOT</version>
<packaging>jar</packaging>
<name>${project.artifactId}</name>
<description>RxHomePreprocessing</description>
<inceptionYear>2015</inceptionYear>
<licenses>
<license>
<name>My License</name>
<url>http://....</url>
<distribution>repo</distribution>
</license>
</licenses>
<properties>
<maven.compiler.source>1.8</maven.compiler.source>
<maven.compiler.target>1.8</maven.compiler.target>
<encoding>UTF-8</encoding>
<scala.version>2.10.6</scala.version>
<scala.compat.version>2.10</scala.compat.version>
</properties>
<dependencies>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>${scala.version}</version>
</dependency>
<!-- Test -->
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.11</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.specs2</groupId>
<artifactId>specs2-core_${scala.compat.version}</artifactId>
<version>2.4.16</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.scalatest</groupId>
<artifactId>scalatest_${scala.compat.version}</artifactId>
<version>2.2.4</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-jdbc</artifactId>
<version>1.2.1000.2.6.0.3-8</version>
</dependency>
<!-- <dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>2.1.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.10</artifactId>
<version>2.1.0</version>
</dependency> -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>1.6.3</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.10</artifactId>
<version>1.6.3</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-hive_2.10</artifactId>
<version>1.6.3</version>
</dependency>
<dependency>
<groupId>com.databricks</groupId>
<artifactId>spark-csv_2.10</artifactId>
<version>1.5.0</version>
</dependency>
</dependencies>
<build>
<sourceDirectory>src/main/scala</sourceDirectory>
<testSourceDirectory>src/test/scala</testSourceDirectory>
<plugins>
<plugin>
<!-- see http://davidb.github.com/scala-maven-plugin -->
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
<version>3.2.0</version>
<executions>
<execution>
<goals>
<goal>compile</goal>
<goal>testCompile</goal>
</goals>
<configuration>
<args>
<arg>-make:transitive</arg>
<arg>-dependencyfile</arg>
<arg>${project.build.directory}/.scala_dependencies</arg>
</args>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<version>2.18.1</version>
<configuration>
<useFile>false</useFile>
<disableXmlReport>true</disableXmlReport>
<!-- If you have classpath issue like NoDefClassError,... -->
<!-- useManifestOnlyJar>false</useManifestOnlyJar -->
<includes>
<include>**/*Test.*</include>
<include>**/*Suite.*</include>
</includes>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<version>2.4</version>
<configuration>
<skipIfEmpty>true</skipIfEmpty>
</configuration>
<executions>
<execution>
<goals>
<goal>jar</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-assembly-plugin</artifactId>
<version>3.0.0</version>
<configuration>
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
<archive>
<manifest>
<mainClass>com.esi.spark.storedprocedure.Test_jdbc_nospark</mainClass>
</manifest>
</archive>
</configuration>
<executions>
<execution>
<id>make-assembly</id>
<phase>package</phase>
<goals>
<goal>single</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<artifactId>maven-clean-plugin</artifactId>
<version>3.0.0</version>
<configuration>
<skip>true</skip>
</configuration>
</plugin>
</plugins>
</build>
</project>
I tried specifying
1 -" jar " for packaging in pom.xml.
2 -changing the maven goals to -
install
clean install
compile package install
But above tries did not help get rid of the message and jar created was of no use.
When I try to execute the spark submit command-
spark-submit --driver-java-options -Djava.io.tmpdir=/home/EH2524/tmp --conf spark.local.dir=/home/EH2524/tmp --driver-memory 2G --executor-memory 2G --total-executor-cores 1 --num-executors 10 --executor-cores 10 --class com.esi.spark.storedprocedure.Test_jdbc_nospark --master yarn /home/EH2524/PROJECT1-0.0.1-20171124.213717-1-jar-with-dependencies.jar
Multiple versions of Spark are installed but SPARK_MAJOR_VERSION is not set
Spark1 will be picked by default
java.lang.ClassNotFoundException: com.esi.spark.storedprocedure.Test_jdbc_nospark
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.util.Utils$.classForName(Utils.scala:175)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:703)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Here, Test_jdbc_nospark is a scala object.
This message was because of the maven-jar-plugin having the as true. Once i removed this, the build is not giving the message "No primary artifact to install, installing attached artifacts instead"
The empty jar was getting created because of incorrect path in pom.xml
Intitally-
<build>
<sourceDirectory>src/main/scala</sourceDirectory>
As jenkins was building through code in git and the pom was inside the project folder.
<build>
<sourceDirectory>folder_name_in_git/src/main/scala</sourceDirectory>