I am trying to teach myself Scala and am using IntelliJ IDEA as my IDE. I launched IntelliJ's sbt shell, ran console, and then entered the following:
import org.apache.spark.SparkConf
import org.apache.spark.sql.{DataFrame, SparkSession}
import java.time.LocalDate
object DataFrameExtensions {
  implicit class DataFrameExtensions(df: DataFrame) {
    def featuresGroup1(groupBy: Seq[String], asAt: LocalDate): DataFrame = {df}
    def featuresGroup2(groupBy: Seq[String], asAt: LocalDate): DataFrame = {df}
  }
}
import DataFrameExtensions._
val spark = SparkSession.builder().config(new SparkConf().setMaster("local[*]")).enableHiveSupport().getOrCreate()
import spark.implicits._
val df = Seq((8, "bat"),(64, "mouse"),(-27, "horse")).toDF("number", "word")
val groupBy = Seq("a","b")
val asAt = LocalDate.now()
val dataFrames = Seq(df.featuresGroup1(groupBy, asAt),df.featuresGroup2(groupBy, asAt))
It fails on the last line with:
scala> val dataFrames = Seq(df.featuresGroup1(groupBy, asAt),df.featuresGroup2(groupBy, asAt))
<console>:25: error: value featuresGroup1 is not a member of org.apache.spark.sql.DataFrame
val dataFrames = Seq(df.featuresGroup1(groupBy, asAt),df.featuresGroup2(groupBy, asAt))
^
<console>:25: error: value featuresGroup2 is not a member of org.apache.spark.sql.DataFrame
val dataFrames = Seq(df.featuresGroup1(groupBy, asAt),df.featuresGroup2(groupBy, asAt))
^
I've copied the code pretty much verbatim from elsewhere (where I know it works), so I don't know why it isn't working here. Why are the functions defined in my implicit class not available as methods on a DataFrame?
It seems you need to give the implicit class DataFrameExtensions a different name, because the enclosing object has the same name. My guess is that the compiler cannot locate the implicit class, since the name DataFrameExtensions then refers to both the object and the class, when you use
import DataFrameExtensions._
I renamed the class as follows, and it works now:
implicit class FeatureGroupImplicits(df: DataFrame) {
  def featuresGroup1(groupBy: Seq[String], asAt: LocalDate): DataFrame = {df}
  def featuresGroup2(groupBy: Seq[String], asAt: LocalDate): DataFrame = {df}
}
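To try the rename in isolation, here is a minimal sketch that should paste into the Scala console without Spark on the classpath. Frame and FrameExtensions are hypothetical stand-ins I made up for DataFrame and the enclosing object; only the method names and the FeatureGroupImplicits name come from the code above. The point it illustrates is simply that the implicit class and the object it lives in have different names.

```scala
import java.time.LocalDate

// Hypothetical stand-in for org.apache.spark.sql.DataFrame, so the sketch runs without Spark.
final case class Frame(rows: Seq[(Int, String)])

// Hypothetical enclosing object; its name differs from the implicit class it contains.
object FrameExtensions {
  implicit class FeatureGroupImplicits(df: Frame) {
    // Placeholder bodies, as in the question: each method returns its input unchanged.
    def featuresGroup1(groupBy: Seq[String], asAt: LocalDate): Frame = df
    def featuresGroup2(groupBy: Seq[String], asAt: LocalDate): Frame = df
  }
}

import FrameExtensions._

val df = Frame(Seq((8, "bat"), (64, "mouse"), (-27, "horse")))
val groupBy = Seq("a", "b")
val asAt = LocalDate.now()

// With the import in scope, both extension methods resolve on Frame.
val frames = Seq(df.featuresGroup1(groupBy, asAt), df.featuresGroup2(groupBy, asAt))
```

With the real DataFrame the structure would be the same: keep the object named DataFrameExtensions if you like, and give the implicit class inside it any other name.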