I have a Spark job deployed on Kubernetes, running Spark 3.3.2. Recently some vulnerabilities were reported in Spark 3.3.2, so I changed my Dockerfile to download 3.4.0 instead of 3.3.2; my application jar is also built against Spark 3.4.0.
However, while deploying I get this error:
Exception in thread "main" java.nio.file.NoSuchFileException: <path>/spark-assembly-1.0.jar
where "spark-assembly-1.0.jar" is the jar which contain my spark job.
I have this in the app's deployment.yaml:
mainApplicationFile: "local:///<path>/spark-assembly-1.0.jar"
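For context, here is a minimal sketch of the surrounding spec, assuming the Kubernetes Spark Operator's SparkApplication CRD (the name, image, and main class below are placeholders; only mainApplicationFile is from my setup):

apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: my-spark-job                  # hypothetical name
spec:
  type: Scala
  mode: cluster
  image: my-registry/spark-app:3.4.0  # hypothetical image built from my Dockerfile
  sparkVersion: "3.4.0"
  mainClass: com.example.Main         # hypothetical entry point
  mainApplicationFile: "local:///<path>/spark-assembly-1.0.jar"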
I have not changed anything related to that. I can see that some code in Spark 3.4.0's core source has changed regarding jar location.
Has the functionality really changed? Is anyone else facing the same issue? Should the path be specified differently?
I hit this same issue. I believe the behaviour change was introduced in:
SPARK-43540 - Add working directory into classpath on the driver in K8S cluster mode
Our Dockerfile was overriding the working directory of the base Spark image:
FROM apache/spark:3.4.1@sha256:a1dd2487a97fb5e35c5a5b409e830b501a92919029c62f9a559b13c4f5c50f63 AS image
# WORKDIR creates /spark-jars but also changes the directory the Spark
# entrypoint runs from, which is what broke jar resolution after SPARK-43540
WORKDIR /spark-jars
COPY --from=build /...../target/scala-2.12/my-spark.jar /spark-jars/
Changing it to this solved the problem:
FROM apache/spark:3.4.1@sha256:a1dd2487a97fb5e35c5a5b409e830b501a92919029c62f9a559b13c4f5c50f63 AS image
# Create /spark-jars as root, then switch back, leaving the base image's
# default working directory untouched
USER root
RUN mkdir /spark-jars
USER spark
COPY --from=build /...../target/scala-2.12/my-spark.jar /spark-jars/
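If you want to sanity-check the image before deploying, something like this should work (the my-spark-app tag is just a placeholder):

docker build -t my-spark-app .
# Should print the base image's default working directory
# (/opt/spark/work-dir for the apache/spark images), not /spark-jars
docker inspect --format '{{.Config.WorkingDir}}' my-spark-app
# Should list my-spark.jar
docker run --rm --entrypoint ls my-spark-app /spark-jars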
Hope this helps!