I recently updated to Spark 2.4.3
and Scala 2.12.3
(from Spark 2.0.0), and I am having issues compiling very simple code (load and show).
My build.sbt with sbt 1.2.8 is:
name := "my-program"
version := "1.0"
scalaVersion := "2.12.3"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.4.3"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.3"
libraryDependencies += "org.apache.spark" %% "spark-mllib" % "2.4.3"
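As an aside, if the jar will eventually be run through spark-submit, the usual convention is to mark the Spark artifacts as provided so they are not bundled into the package; a sketch, assuming the same versions:
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.4.3" % "provided"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.3" % "provided"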
I am developing with the Scala IDE for Eclipse, and the jars spark-core_2.12-2.4.3.jar, spark-mllib_2.12-2.4.3.jar, and spark-sql_2.12-2.4.3.jar are on my build path (Eclipse shows no errors).
I updated Spark, Scala, and sbt with Homebrew; I don't know whether that interferes with how sbt resolves the jars.
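To rule that out, these standard sbt commands (run from the project root, nothing Spark-specific) report which versions sbt itself picked up:
sbt about
sbt scalaVersion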
I tried sbt clean package and sbt package many times, but all I get is:
[error] /Users/me/myproject/src/main/scala/Analysis.scala:5:12: object apache is not a member of package org
[error] import org.apache.spark.sql.SparkSession
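As far as I understand, that error means the Spark jars never make it onto sbt's compile classpath at all; the standard sbt task for inspecting what was actually resolved is:
sbt "show Compile / dependencyClasspath"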
Beyond that, I am out of ideas for what to try.
Analysis.scala:
package main.scala

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object Analysis {
  def main(args: Array[String]): Unit = {
    // Start Spark session
    val spark = SparkSession.builder().getOrCreate()
    import spark.implicits._

    // Reduce verbosity of output when running in console
    spark.sparkContext.setLogLevel("WARN")

    // Read with the CSV data source ("com.databricks.spark.csv" is an
    // alias for the built-in csv reader in Spark 2.x); .load() honours
    // the format set above, whereas calling .json() would override it
    val df = spark.read
      .format("com.databricks.spark.csv")
      .option("header", "true")
      .option("parserLib", "UNIVOCITY") // legacy spark-csv option, ignored by the built-in reader
      .option("inferSchema", "true")
      .load("data/transaction.txt")

    df.printSchema()
  }
}
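For reference, once it compiles this is how the jar gets submitted (the class name follows from the package and object above, and the jar path from the name, version, and Scala version in the build.sbt):
sbt package
spark-submit --class main.scala.Analysis target/scala-2.12/my-program_2.12-1.0.jar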
Solved it: the build.sbt was not in the project folder (/Users/me/myproject/) but in the src folder (/Users/me/myproject/src), so when I ran sbt package it never found my build.sbt and therefore could not resolve the dependencies.
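For anyone else who hits this: sbt only picks up a build.sbt sitting at the project root, so the layout should look like this (standard sbt convention):
/Users/me/myproject/
  build.sbt
  src/
    main/
      scala/
        Analysis.scala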