scala, apache-spark, scala-ide

spark build path is cross-compiled with an incompatible version of Scala (2.11.0)


I'm observing some build errors in Scala IDE. While I know how to fix them, I still don't understand how things work under the hood. I'll first elaborate on my case, then ask the questions at the very bottom.

Environment: macOS

Spark version 2.4.5: brew info apache-spark returns "apache-spark: stable 2.4.5, HEAD". On the official page (https://spark.apache.org/docs/latest/index.html) I'm seeing this: "Spark 2.4.5 uses Scala 2.12. You will need to use a compatible Scala version (2.12.x)." So my understanding is that I need to choose Scala version 2.12 in Scala IDE:

[screenshot: selecting the Scala version in Scala IDE]

This ends up with many build errors (to save space, I'm posting only some of them here):

[screenshot: some of the build errors]

From the errors I got the idea to try Scala 2.11. That works and fixes the build errors, but I'm not satisfied, as I still don't understand why it works. The jar files mentioned in the error messages are all taken from the Spark 2.4.5 installation folder (/usr/local/Cellar/apache-spark/2.4.5/libexec/jars/).
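
A quick way to confirm which Scala version the installed binaries actually run on is to ask the runtime itself. This is just a sanity-check sketch; the spark-shell path assumes the Homebrew layout above:

```scala
// Paste into the spark-shell bundled with the installation, e.g.
// /usr/local/Cellar/apache-spark/2.4.5/libexec/bin/spark-shell.
// Both values are standard APIs: Spark's version constant and the
// Scala runtime's own version string.
println(org.apache.spark.SPARK_VERSION)            // e.g. 2.4.5
println(scala.util.Properties.versionNumberString) // should print 2.11.x if the jars really are the 2.11 build
```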

So, my questions are:

  1. Why does choosing Scala 2.11 fix the build errors, given that I'm using Spark 2.4.5 (which is built with 2.12 according to https://spark.apache.org/docs/latest/index.html)? Why doesn't Scala 2.12 work?

  2. My understanding is that most of the .jar files in /usr/local/Cellar/apache-spark/2.4.5/libexec/jars are built with Scala 2.11, judging purely by the file names: breeze_2.11-0.13.2.jar, spark-core_2.11-2.4.5.jar, and many others. Is that expected? (See the sketch after this list.)
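
For question 2, here is a small sketch that tallies the Scala binary suffixes embedded in the jar file names. It relies only on the _2.xx naming convention and the Homebrew path mentioned above:

```scala
import java.io.File

// Tally the _2.10/_2.11/_2.12 suffixes across the bundled jar names.
val jarsDir = new File("/usr/local/Cellar/apache-spark/2.4.5/libexec/jars")
val suffix  = "_(2\\.1[0-2])-".r

val counts = Option(jarsDir.listFiles()).getOrElse(Array.empty[File])
  .map(_.getName)
  .flatMap(name => suffix.findFirstMatchIn(name).map(_.group(1)))
  .groupBy(identity)
  .map { case (version, hits) => version -> hits.length }

println(counts) // if the suspicion is right: only 2.11 entries, no 2.12
```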


Solution

  • I'll answer my own question.

    The statement "For the Scala API, Spark 2.4.5 uses Scala 2.12. You will need to use a compatible Scala version (2.12.x)." (taken from https://spark.apache.org/docs/latest/index.html) is misleading.

    Here is the relevant history of the Spark releases:

    2.4.1 (https://spark.apache.org/releases/spark-release-2-4-1.html):

    "In Apache Spark 2.4.1, Scala 2.12 support is GA, and it’s no longer experimental. We will drop Scala 2.11 support in Spark 3.0, so please provide us feedback." Thus Scala 2.12 support becomes official starting from 2.4.1.

    2.4.2 (https://spark.apache.org/releases/spark-release-2-4-2.html): "Note that Scala 2.11 support is deprecated from 2.4.1 onwards. As of 2.4.2, the pre-built convenience binaries are compiled for Scala 2.12."

    However, with version 2.4.3, the default Scala version was reverted back to 2.11:

    2.4.3 (https://spark.apache.org/releases/spark-release-2-4-3.html): "Note that 2.4.3 switched the default Scala version from Scala 2.12 to Scala 2.11, which is the default for all the previous 2.x releases except 2.4.2."
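
    So the pre-built binaries, including the ones Homebrew packages for 2.4.5, target Scala 2.11, and that is why selecting Scala 2.11 in Scala IDE fixes the build. The same reasoning applies to an sbt project; here is a minimal build.sbt sketch (the coordinates are the standard Maven ones, and the "provided" scope assumes the jars come from the local installation at runtime):

    ```scala
    // build.sbt: a minimal sketch for compiling against the Scala 2.11
    // build of Spark 2.4.5, matching the _2.11 jars that ship with the
    // Homebrew package.
    scalaVersion := "2.11.12"

    libraryDependencies ++= Seq(
      // %% appends the Scala binary suffix, so these resolve to
      // spark-core_2.11 and spark-sql_2.11, like the bundled jars.
      "org.apache.spark" %% "spark-core" % "2.4.5" % "provided",
      "org.apache.spark" %% "spark-sql"  % "2.4.5" % "provided"
    )
    ```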