apache-spark, pyspark

Difference between alias and withColumnRenamed


What is the difference between:

my_df = my_df.select(col('age').alias('age2'))

and

my_df = my_df.select(col('age').withColumnRenamed('age', 'age2'))

Solution

  • The second expression is not going to work; you need to call withColumnRenamed() on your DataFrame. I assume you mean:

    my_df = my_df.withColumnRenamed('age', 'age2')
    

    And to answer your question, there is no difference.
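    As a quick illustration, here is a minimal sketch (the sample data is made up for demonstration) showing that both forms yield a column named age2 on a single-column DataFrame:

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col

    spark = SparkSession.builder.master("local[*]").getOrCreate()

    # Hypothetical single-column DataFrame for illustration
    df = spark.createDataFrame([(34,), (45,)], ["age"])

    # Renaming via alias() inside a select()
    renamed_with_alias = df.select(col("age").alias("age2"))

    # Renaming via withColumnRenamed() called on the DataFrame itself
    renamed_with_rename = df.withColumnRenamed("age", "age2")

    renamed_with_alias.printSchema()   # age2: long
    renamed_with_rename.printSchema()  # age2: long

    Note that withColumnRenamed() keeps every other column of the DataFrame, while select() returns only the columns you list, so on a multi-column DataFrame you would typically reach for withColumnRenamed() when you only want to rename.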