scalaapache-sparkdateapache-spark-sqlazure-databricks

Map monthname to monthnumber in Azure Databricks using Scala


Need to map monthname to monthnumber in Azure Databricks using Scala. I have column name PERIOD which have data like months name(like Jan,Feb,Mar,....,Nov,Dec), I want to replace this monthname with monthnumber (like 01,02,03,...,11,12). Result should be like Jan,Feb,Mar,..,Nov,Dec replaced with 01,02,03,...,11,12.

"Jan"  -> 01,
"Feb"  -> 02,
"Mar"  -> 03,
"Apr"  -> 04,
"May"  -> 05,
"Jun"  -> 06,
"Jul"  -> 07,
"Aug"  -> 08,
"Sep"  -> 09,
"Oct"  -> 10,
"Nov"  -> 11,
"Dec"  -> 12

enter image description here


Solution

  • I have a dataframe with column called month_in_words as in the following image:

    enter image description here

    val monthNumber = Map(
          "Jan" -> "01",
          "Feb" -> "02",
          "Mar" -> "03",
          "Apr" -> "04",
          "May" -> "05",
          "Jun" -> "06",
          "Jul" -> "07",
          "Aug" -> "08",
          "Sep" -> "09",
          "Oct" -> "10",
          "Nov" -> "11",
          "Dec" -> "12"
      )
    
    var final_df = df.rdd.map(f => {
        val number = monthNumber(f.getString(0))
        (f.getString(0),number)
    }).toDF("month_in_words","month_in_number")
    //display(df)
    

    enter image description here

    var final_df = df.rdd.map(f => monthNumber(f.getString(0))).toDF("month_in_number")
    //display(final_df)
    

    enter image description here