python, select, types, casting, pyspark

How to change multiple columns' types in pyspark?


I am just studying pyspark. I want to change the column types like this:

df1=df.select(df.Date.cast('double'),df.Time.cast('double'),
          df.NetValue.cast('double'),df.Units.cast('double'))

Here df is a DataFrame; I select 4 columns and cast each of them to double. Because I use select, all the other columns are dropped from the result.

But what if df has hundreds of columns and I only need to change those 4? I still need to keep all the other columns. How can I do that?


Solution

  • Loop over just the columns you want to cast and use withColumn, which replaces a column in place and keeps everything else:

        cols_to_cast = ['Date', 'Time', 'NetValue', 'Units']
        for c in cols_to_cast:
            # withColumn replaces the existing column; all other columns are preserved
            df = df.withColumn(c, df[c].cast('double'))
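
If you prefer a single pass instead of repeated withColumn calls, a select with a list comprehension achieves the same result. This is just an equivalent sketch, assuming the same four column names from the question:

    from pyspark.sql import functions as F

    cols_to_cast = {'Date', 'Time', 'NetValue', 'Units'}
    # cast the chosen columns to double, pass every other column through unchanged
    df1 = df.select(
        [F.col(c).cast('double') if c in cols_to_cast else F.col(c) for c in df.columns]
    )

Either approach keeps all columns; the select version builds the whole projection at once, which can be slightly cheaper than chaining many withColumn calls on very wide DataFrames.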