apache-spark pyspark apache-spark-sql

Add month number column from timestamp column


I have a time column with timestamps of the form 2018-04-12 06:48:39. How can I add a Month column holding the month number of each timestamp (4 in this case)?


Solution

  • Use pyspark.sql.functions.month, which extracts the month number (1 to 12) from a date or timestamp column:

    import pyspark.sql.functions as F
    df.withColumn('month', F.month('time')).show()
    +-------------------+-----+
    |               time|month|
    +-------------------+-----+
    |2018-04-12 06:48:39|    4|
    +-------------------+-----+