scala apache-spark spark-operator

Upgrading from Spark 3.3.2 to 3.4.0 gives Exception in thread "main" java.nio.file.NoSuchFileException, although the jar is present at the location


I have a Spark job deployed on Kubernetes (k8s) running version 3.3.2. Recently, some vulnerabilities were reported in Spark 3.3.2.

I changed my Dockerfile to download 3.4.0 instead of 3.3.2, and my application jar is now built against Spark 3.4.0.

However, while deploying, I get this error:

Exception in thread "main" java.nio.file.NoSuchFileException: <path>/spark-assembly-1.0.jar

where "spark-assembly-1.0.jar" is the jar which contain my spark job.

I have this in the app's deployment.yaml:

 mainApplicationFile: "local:///<path>/spark-assembly-1.0.jar"

and I have not changed anything related to it (the surrounding spec is shown below for reference). I can see that some code in Spark core changed in 3.4.0 regarding how the jar location is handled.

Has the functionality really changed? Is anyone else facing the same issue? Should the path be specified in a different way?
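
For reference, the relevant part of my SparkApplication spec looks roughly like this (the name, image and main class below are placeholders rather than my real values; only mainApplicationFile is copied from my actual config):

apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: my-spark-app                       # placeholder name
spec:
  type: Scala
  mode: cluster
  image: "my-registry/spark-app:3.4.0"     # placeholder image
  mainClass: com.example.Main              # placeholder class
  mainApplicationFile: "local:///<path>/spark-assembly-1.0.jar"
  sparkVersion: "3.4.0"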


Solution

  • I hit this same issue. I believe the behaviour change was introduced in:

SPARK-43540 - Add working directory into classpath on the driver in K8S cluster mode (https://issues.apache.org/jira/browse/SPARK-43540)

Our Dockerfile was overriding the working directory of the base Spark image:

    FROM apache/spark:3.4.1@sha256:a1dd2487a97fb5e35c5a5b409e830b501a92919029c62f9a559b13c4f5c50f63 as image
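    # This WORKDIR override is what broke after upgrading to 3.4.x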
    WORKDIR /spark-jars
    COPY --from=build /...../target/scala-2.12/my-spark.jar /spark-jars/
    

    Changing it to this solved the problem:

    FROM apache/spark:3.4.1@sha256:a1dd2487a97fb5e35c5a5b409e830b501a92919029c62f9a559b13c4f5c50f63 as image
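    # Create /spark-jars as a plain directory; the base image's default WORKDIR is left untouched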
    USER root
    RUN mkdir /spark-jars
    USER spark
    COPY --from=build /...../target/scala-2.12/my-spark.jar /spark-jars/
    
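    If your image genuinely needs a different working directory for intermediate build steps, the same idea should also work by restoring the base image's default at the end. This is just an untested sketch on top of the fix above, assuming the apache/spark image's default working directory of /opt/spark/work-dir:

    FROM apache/spark:3.4.1@sha256:a1dd2487a97fb5e35c5a5b409e830b501a92919029c62f9a559b13c4f5c50f63 as image
    USER root
    # Use whatever working directory the intermediate steps need
    WORKDIR /spark-jars
    COPY --from=build /...../target/scala-2.12/my-spark.jar /spark-jars/
    # Put the base image's defaults back before handing off to the runtime user
    WORKDIR /opt/spark/work-dir
    USER spark
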

    Hope this helps!