What is the difference between sort and orderBy on a Spark DataFrame?
scala> zips.printSchema
root
|-- _id: string (nullable = true)
|-- city: string (nullable = true)
|-- loc: array (nullable = true)
|    |-- element: double (containsNull = true)
|-- pop: long (nullable = true)
|-- state: string (nullable = true)
The commands below produce the same result:
zips.sort(desc("pop")).show
zips.orderBy(desc("pop")).show
orderBy is just an alias for the sort function.
From the Spark source (the doc comment on Dataset.orderBy):
/**
* Returns a new Dataset sorted by the given expressions.
* This is an alias of the `sort` function.
*
* @group typedrel
* @since 2.0.0
*/
@scala.annotation.varargs
def orderBy(sortCol: String, sortCols: String*): Dataset[T] = sort(sortCol, sortCols : _*)
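The alias can also be checked empirically. The following is a minimal sketch, assuming Spark SQL is on the classpath and a local session is acceptable. The object name SortVsOrderBy, the helper sameOrdering, and the sample rows are made up for illustration and are not the real zips data.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.functions.desc

object SortVsOrderBy {
  // Hypothetical helper: returns true when sort and orderBy
  // yield the same rows in the same order.
  def sameOrdering(spark: SparkSession): Boolean = {
    import spark.implicits._

    // Made-up sample rows standing in for the zips dataset.
    // The pop values are distinct, so the descending order is unambiguous.
    val zips = Seq(
      ("CITY_A", "AA", 300L),
      ("CITY_B", "BB", 100L),
      ("CITY_C", "CC", 200L)
    ).toDF("city", "state", "pop")

    val viaSort: Array[Row] = zips.sort(desc("pop")).collect()
    val viaOrderBy: Array[Row] = zips.orderBy(desc("pop")).collect()

    // Element-wise comparison, so the row order matters.
    viaSort.sameElements(viaOrderBy)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("sort-vs-orderBy")
      .getOrCreate()
    try println(s"same ordering: ${sameOrdering(spark)}")
    finally spark.stop()
  }
}
```

The string overloads behave the same way: as the definition quoted above shows, zips.orderBy("pop") simply forwards to zips.sort("pop").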