scala, apache-spark, apache-spark-sql

Filter only non-empty arrays in a Spark DataFrame


How can I filter only non-empty arrays?

import org.apache.spark.sql.types.ArrayType

// collect the names of all array-typed columns in the schema
val arrayFields = secondDF.schema.filter(st => st.dataType.isInstanceOf[ArrayType])
val names = arrayFields.map(_.name)

Or should I use this code?

val DF1 = DF
  .select(col("key"), explode(col("ob")).as("collection"))
  .select(col("collection.*"), col("key"))

When I run it, I get this error:

 org.apache.spark.sql.AnalysisException: Can only star expand struct data types. Attribute: ArrayBuffer(collection);

Any help is appreciated.


Solution

  • Use the size function:

    import org.apache.spark.sql.functions._
    import spark.implicits._   // needed for the $"column" syntax (assumes a SparkSession named spark)

    secondDF.filter(size($"objectiveAttachment") > 0)
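
For reference, below is a minimal, self-contained sketch of the size-based filter. The sample data, the key column, and the column name objectiveAttachment are assumptions for illustration; the last part ties in the schema-based discovery of array columns from the question.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{col, size}
    import org.apache.spark.sql.types.ArrayType

    val spark = SparkSession.builder()
      .appName("filter-non-empty-arrays")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // hypothetical sample data: a key plus an array column named objectiveAttachment
    val secondDF = Seq(
      ("a", Seq("x", "y")),
      ("b", Seq.empty[String]),
      ("c", Seq("z"))
    ).toDF("key", "objectiveAttachment")

    // keep only the rows whose array column has at least one element
    secondDF.filter(size(col("objectiveAttachment")) > 0).show()
    // rows "a" and "c" remain; "b" is dropped because its array is empty

    // the same predicate can be applied to every array-typed column found via the schema
    val arrayCols = secondDF.schema.collect {
      case f if f.dataType.isInstanceOf[ArrayType] => f.name
    }
    val allArraysNonEmpty = arrayCols.map(n => size(col(n)) > 0).reduce(_ && _)
    secondDF.filter(allArraysNonEmpty).show()

Note that a null array is also filtered out by this predicate, since size returns -1 (or null, depending on Spark's legacy/ANSI settings) for a null input, and neither satisfies > 0.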