Tags: scala, apache-spark, generics, apache-spark-dataset, apache-spark-encoders

Scala generic encoder for a Spark case class


How can I get this method to compile? Strangely, Spark's implicits are already imported.

def loadDsFromHive[T <: Product](tableName: String, spark: SparkSession): Dataset[T] = {
  import spark.implicits._
  spark.sql(s"SELECT * FROM $tableName").as[T]
}

This is the error:

Unable to find encoder for type stored in a Dataset.  Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._  Support for serializing other types will be added in future releases.
[error]     spark.sql(s"SELECT * FROM $tableName").as[T]

Solution

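  • According to the source code for org.apache.spark.sql.SQLImplicits, the implicit Encoder for Product types (newProductEncoder) is only available when a TypeTag for the type is in scope. Inside a generic method T is abstract, so the compiler has no TypeTag for it unless you ask for one with a context bound: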

    import scala.reflect.runtime.universe.TypeTag
    def loadDsFromHive[T <: Product: TypeTag](tableName: String, spark: SparkSession): Dataset[T] = ...
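
For completeness, here is a minimal sketch of the whole thing compiled and run end to end. The `Person` case class, the `people` view, and the local SparkSession settings are made up for illustration; a temporary view stands in for a real Hive table. The `loadDs` variant is an alternative that is not part of the answer above: it asks for the `Encoder` directly, so the caller's `import spark.implicits._` supplies it.

```scala
import org.apache.spark.sql.{Dataset, Encoder, SparkSession}
import scala.reflect.runtime.universe.TypeTag

// Hypothetical case class, used only for this illustration.
case class Person(name: String, age: Int)

object LoadDsExample {
  // The TypeTag context bound gives spark.implicits._ what it needs
  // to derive an Encoder[T] for the abstract type T.
  def loadDsFromHive[T <: Product: TypeTag](tableName: String, spark: SparkSession): Dataset[T] = {
    import spark.implicits._
    spark.sql(s"SELECT * FROM $tableName").as[T]
  }

  // Alternative (an assumption, not from the answer above): require the
  // Encoder itself and let the call site resolve it.
  def loadDs[T: Encoder](tableName: String, spark: SparkSession): Dataset[T] =
    spark.sql(s"SELECT * FROM $tableName").as[T]

  def main(args: Array[String]): Unit = {
    // Local session for the sketch; a real job would use enableHiveSupport().
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("encoder-sketch")
      .getOrCreate()
    import spark.implicits._

    // A temp view stands in for the Hive table.
    Seq(Person("Ann", 30)).toDS().createOrReplaceTempView("people")

    loadDsFromHive[Person]("people", spark).show()
    loadDs[Person]("people", spark).show()

    spark.stop()
  }
}
```

At the call site the type is concrete (`Person`), so the compiler can materialize the `TypeTag` or the `Encoder` there and pass it into the generic method.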