apache-spark apache-spark-sql hdp spark-shell

array_max Spark SQL function not found


I need to use the functions array_max and array_min from org.apache.spark.sql.functions._, but neither function is found. Why?

scala> import org.apache.spark.sql.functions._
import org.apache.spark.sql.functions._

scala> ... array_max(col(..))
error: not found: value array_max

PS:

  1. Scala version 2.11.8
  2. Spark version 2.3.0.2.6.5.0-292
  3. HDP 2.6.5

Solution

  • def array_max(e: org.apache.spark.sql.Column): org.apache.spark.sql.Column

    The array_max function is not available in Spark 2.3; it was only added in Spark 2.4 (the same applies to array_min). If you cannot upgrade, see the workaround sketch after the reference links below.

    For reference, please check the Git repos below.

    Spark 2.3

    Spark 2.4
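
  • If upgrading to Spark 2.4 is not an option, here is a minimal workaround sketch for Spark 2.3. The DataFrame and column names (df, values) are assumptions for illustration; it only uses sort_array and Column.getItem, which already exist in Spark 2.3, plus a plain Scala UDF as an alternative.

        import org.apache.spark.sql.functions._

        // Example array column (assumed data; run in spark-shell, where spark.implicits._ is already in scope).
        val df = Seq(Seq(1, 3, 2), Seq(7, 5)).toDF("values")

        // array_max equivalent: sort the array descending and take the first element.
        // array_min equivalent: sort the array ascending and take the first element.
        df.select(
          sort_array(col("values"), false).getItem(0).as("max_value"),
          sort_array(col("values"), true).getItem(0).as("min_value")
        ).show()

        // Alternative: wrap Scala's max in a UDF, returning None for null/empty arrays.
        val arrayMaxUdf = udf((xs: Seq[Int]) => if (xs == null || xs.isEmpty) None else Some(xs.max))
        df.select(arrayMaxUdf(col("values")).as("max_value")).show()

    The sort_array approach stays in native Spark SQL (no serialization overhead of a UDF), while the UDF version is easier to generalize to other element types.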