pyspark

Multiple aggregations on the same column using agg in PySpark


I am not able to get multiple metrics on one column using agg, as shown below.

table.select("date_time")\
    .withColumn("date",to_timestamp("date_time"))\
    .agg({'date_time':'max', 'date_time':'min'}).show()

I see that the second aggregation overwrites the first. Can someone help me get multiple aggregations on the same column?


Solution

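  • I can't replicate this to confirm that it works, but a Python dict cannot hold the same key twice, so the second 'date_time' entry replaces the first before agg ever sees it. Instead of passing a dict, I would suggest passing the aggregate expressions directly: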

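    # Use Spark's min/max, not Python's built-ins of the same name
    from pyspark.sql import functions as F

    table.select("date_time")\
        .withColumn("date", F.to_timestamp("date_time"))\
        .agg(F.min("date_time"), F.max("date_time")).show()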
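The overwriting seen in the question can be reproduced without Spark at all. The following is a minimal sketch in plain Python; the names `spec` and `pairs` are illustrative only and do not come from the original post.

```python
# A dict literal with a repeated key keeps only the last value,
# so agg() would receive a single entry for 'date_time'.
spec = {'date_time': 'max', 'date_time': 'min'}
print(spec)       # {'date_time': 'min'}
print(len(spec))  # 1

# A list of (column, function) pairs keeps both requests intact.
pairs = [('date_time', 'max'), ('date_time', 'min')]
print(len(pairs))  # 2
```

If a data-driven form is preferred over writing each expression by hand, pairs like these could presumably be mapped to column expressions, for example with `getattr(F, fn)(col)` where `F` is `pyspark.sql.functions`, and then unpacked into `agg(*exprs)`. That part is an untested suggestion rather than something verified against the original table.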