Tags: scala, apache-spark-sql, user-defined-functions, user-defined-aggregate

Why does a Scala WrappedArray[Int] made from Array(null, null) return 0 on apply?


While working on a Spark SQL UDAF, I found that some of my input columns unexpectedly turn from null into 0.

After some experimenting in the REPL, it turns out the behavior comes from Scala itself (I am on Scala 2.10.5).

Code snippet below:

import scala.collection.mutable

val wa = mutable.WrappedArray.make[Int](Array(null, null))

wa

wa(1)

Could someone familiar with Scala please explain why this happens and what is going on under the hood?


Solution

  • You called the method make[Int], which is declared as follows:

    def make[T](x: AnyRef): WrappedArray[T] = (x match {
        case null              => null
        case x: Array[AnyRef]  => new ofRef[AnyRef](x)
        case x: Array[Int]     => new ofInt(x)
        case x: Array[Double]  => new ofDouble(x)
        case x: Array[Long]    => new ofLong(x)
        case x: Array[Float]   => new ofFloat(x)
        case x: Array[Char]    => new ofChar(x)
        case x: Array[Byte]    => new ofByte(x)
        case x: Array[Short]   => new ofShort(x)
        case x: Array[Boolean] => new ofBoolean(x)
        case x: Array[Unit]    => new ofUnit(x)
      }).asInstanceOf[WrappedArray[T]]
    

    In your case x is Array(null, null), which is an instance of Array[AnyRef]. The type argument Int is only used in the final cast, so make creates and returns an instance of the class ofRef[AnyRef], which is declared as:

    final class ofRef[T <: AnyRef](val array: Array[T]) extends WrappedArray[T] with Serializable {
      lazy val elemTag = ClassTag[T](arrayElementClass(array.getClass))
      def length: Int = array.length
      def apply(index: Int): T = array(index).asInstanceOf[T]
      def update(index: Int, elem: T) { array(index) = elem }
    }
    

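    When you call wa(1), you invoke the apply method of this class. Your second element is null, and because T is erased at runtime the cast asInstanceOf[T] checks nothing, so apply returns null. The call site, however, is typed as Int, so the result is unboxed there, and unboxing null yields 0, in the same way that null.asInstanceOf[Int] returns 0.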
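
The behavior described above can be reproduced without WrappedArray. The following is a minimal sketch, not the library's code: the object name NullUnboxingDemo and the helpers unsafeGet and safeGet are hypothetical names introduced only for illustration. unsafeGet imitates what ofRef.apply does with an erased type parameter, and safeGet shows one possible way to keep nulls visible by leaving the elements boxed and wrapping them in Option.

```scala
object NullUnboxingDemo {
  // Casting null to a primitive type yields that type's default value.
  val direct: Int = null.asInstanceOf[Int]

  // Hypothetical helper imitating ofRef.apply: T is erased at runtime,
  // so the cast is unchecked and a null element passes through unchanged.
  def unsafeGet[T](xs: Array[AnyRef], i: Int): T = xs(i).asInstanceOf[T]

  // Hypothetical null-preserving alternative: keep the elements boxed
  // (java.lang.Integer) and turn a null into None instead of 0.
  def safeGet(xs: Array[java.lang.Integer], i: Int): Option[Int] =
    Option(xs(i)).map(_.intValue)

  def main(args: Array[String]): Unit = {
    val raw: Array[AnyRef] = Array(null, null)
    // The result is unboxed to Int at this call site, so null becomes 0.
    val viaGeneric: Int = unsafeGet[Int](raw, 1)

    val boxed: Array[java.lang.Integer] = Array(null, Integer.valueOf(7))

    println(direct)            // 0
    println(viaGeneric)        // 0
    println(safeGet(boxed, 0)) // None
    println(safeGet(boxed, 1)) // Some(7)
  }
}
```

In a Spark UDAF the same idea would mean checking for null before reading a primitive value, for example with Row.isNullAt, rather than relying on the value read from the wrapped array.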