scala apache-spark geospark apache-sedona

Scala 2.12 + Spark 3 + sedona-sql-3.0_2.12


I'm trying to use Sedona with Scala and Spark. Here is my build.sbt file:

ThisBuild / scalaVersion := "2.12.12"

libraryDependencies ++= Seq(
  "org.apache.spark" % "spark-core_2.12" % "3.0.1",
  "org.apache.spark" % "spark-sql_2.12" % "3.0.1",
  "org.apache.sedona" % "sedona-python-adapter-2.4_2.11" % "1.2.1-incubating",
  "org.apache.sedona" % "sedona-core-3.0_2.12" % "1.2.1-incubating",
  "org.apache.sedona" % "sedona-sql-3.0_2.12" % "1.2.1-incubating",
  "org.apache.sedona" % "sedona-viz-2.4_2.11" % "1.2.1-incubating"

)

The code works perfectly with Scala 2.11 and Spark 2.4, but when I switch to Spark 3 I get the following error when running my code:

[error] Provider org.apache.spark.sql.sedona_sql.io.GeotiffFileFormat could not be instantiated


 Caused by: java.lang.NoClassDefFoundError: org/apache/spark/sql/execution/datasources/FileFormat$class
[error]         at org.apache.spark.sql.sedona_sql.io.GeotiffFileFormat.<init>(GeotiffFileFormat.scala:54)

Any thoughts?


Solution

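  • The NoClassDefFoundError for FileFormat$class most likely comes from a Scala version mismatch: `$class` helper classes are generated for traits by Scala 2.11 and do not exist in Scala 2.12 builds of Spark. Your build.sbt still pulls in sedona-python-adapter-2.4_2.11 and sedona-viz-2.4_2.11, which are built for Spark 2.4 and Scala 2.11.

    According to https://sedona.apache.org/setup/maven-coordinates/#use-sedona-fat-jars, the only three jars you need are the ones listed below. Please do not add other jars to your dependencies.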

    <dependency>
      <groupId>org.apache.sedona</groupId>
      <artifactId>sedona-python-adapter-3.0_2.12</artifactId>
      <version>1.2.1-incubating</version>
    </dependency>
    <dependency>
      <groupId>org.apache.sedona</groupId>
      <artifactId>sedona-viz-3.0_2.12</artifactId>
      <version>1.2.1-incubating</version>
    </dependency>
    <!-- Optional: https://mvnrepository.com/artifact/org.datasyslab/geotools-wrapper -->
    <dependency>
      <groupId>org.datasyslab</groupId>
      <artifactId>geotools-wrapper</artifactId>
      <version>1.1.0-25.2</version>
    </dependency>
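
Since the question uses sbt rather than Maven, here is a rough sketch of how the same coordinates might look in build.sbt. It assumes the Spark and Scala versions from the question stay unchanged, and it has not been tested against an actual build. The separate sedona-core and sedona-sql entries are dropped, following the advice above not to add other jars.

```scala
// build.sbt (sketch): Sedona fat jars for Spark 3.0 / Scala 2.12
ThisBuild / scalaVersion := "2.12.12"

libraryDependencies ++= Seq(
  // Spark versions taken from the question
  "org.apache.spark" % "spark-core_2.12" % "3.0.1",
  "org.apache.spark" % "spark-sql_2.12" % "3.0.1",
  // Every Sedona artifact carries the 3.0_2.12 suffix; no 2.4_2.11 jars remain
  "org.apache.sedona" % "sedona-python-adapter-3.0_2.12" % "1.2.1-incubating",
  "org.apache.sedona" % "sedona-viz-3.0_2.12" % "1.2.1-incubating",
  // Optional, per the comment in the Maven snippet
  "org.datasyslab" % "geotools-wrapper" % "1.1.0-25.2"
)
```

The point to check is that every artifact suffix matches the Scala binary version (2.12) and the Spark line (3.0); a single leftover `_2.11` artifact on the classpath is enough to trigger the `$class` error.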