scala, apache-spark, dataframe, apache-spark-sql

How do I retrieve the month from a date column's values in a Scala DataFrame?


Given:

val df = Seq((1L, "04-04-2015")).toDF("id", "date")
val df2 = df.withColumn("month", from_unixtime(unix_timestamp($"date", "dd/MM/yy"), "MMMMM"))
df2.show()

I got this output:

+---+----------+-----+
| id|      date|month|
+---+----------+-----+
|  1|04-04-2015| null|
+---+----------+-----+

However, I want the output to be as below:

+---+----------+-----+
| id|      date|month|
+---+----------+-----+
|  1|04-04-2015|April|
+---+----------+-----+

How can I do that in Spark SQL using Scala?


Solution

  • This should do it:

    import org.apache.spark.sql.functions.{date_format, to_date}

    val df2 = df.withColumn("month", date_format(to_date($"date", "dd-MM-yyyy"), "MMMM"))
    
    df2.show
    +---+----------+-----+
    | id|      date|month|
    +---+----------+-----+
    |  1|04-04-2015|April|
    +---+----------+-----+
    

    NOTE: the original attempt returns null because the parsing pattern
    "dd/MM/yy" does not match the actual layout of the data ("dd-MM-yyyy");
    once the pattern matches, "MMMM" renders the full month name. A fuller,
    self-contained sketch follows below.
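
A minimal, self-contained sketch of the fix, assuming a local SparkSession and a hypothetical object name MonthNameExample (neither is part of the original answer). It also shows the question's original from_unixtime/unix_timestamp approach with the corrected pattern, which is an alternative on Spark versions older than 2.2, where to_date does not accept a format string:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{date_format, from_unixtime, to_date, unix_timestamp}

    object MonthNameExample {
      def main(args: Array[String]): Unit = {
        // Hypothetical local session; in spark-shell `spark` already exists.
        val spark = SparkSession.builder()
          .appName("month-name-example")
          .master("local[*]")
          .getOrCreate()
        import spark.implicits._

        val df = Seq((1L, "04-04-2015")).toDF("id", "date")

        // Spark 2.2+: parse with a pattern that matches the data, then format the month name.
        val withToDate = df.withColumn(
          "month", date_format(to_date($"date", "dd-MM-yyyy"), "MMMM"))
        withToDate.show()  // month = April

        // Alternative (also works on older Spark): the question's approach,
        // with the pattern corrected to "dd-MM-yyyy" and "MMMM".
        val withUnixTime = df.withColumn(
          "month", from_unixtime(unix_timestamp($"date", "dd-MM-yyyy"), "MMMM"))
        withUnixTime.show()  // month = April

        spark.stop()
      }
    }

Both approaches return the month as a string; if you need the month number instead, month(to_date($"date", "dd-MM-yyyy")) returns it as an integer.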