pandas

Copying columns in pandas


I want to compare two or more columns and place the larger and smaller values ​​in a new column.

First, I copied the two columns and then changed the values ​​of the rows that met the conditions.

However, there is a problem where row values ​​that do not meet the conditions are changed to nan.

  1. How not to change to nan
  2. How to do this all at once (if possible)

sample:

import pandas as pd

a = [
    {'a': 5, 'b': 10, 'c': 5},
    {'a': 4, 'b': 0.5, 'c': 1},
    {'a': 3.5, 'b': 15, 'c': 9},
    {'a': 2.1, 'b': 5, 'c': 8},
    {'a': 0.1, 'b': 1, 'c': 5},
]

df = pd.DataFrame(a)
df.loc[:, ['Big', 'Small']] = df[['a', 'b']].values
print(df.head())
df.loc[df['a'] < df['b'], ['Big', 'Small']] = df[['b', 'a']]
print(df.head())
>>
     a     b  c  Big  Small
0  5.0  10.0  5  5.0   10.0
1  4.0   0.5  1  4.0    0.5
2  3.5  15.0  9  3.5   15.0
3  2.1   5.0  8  2.1    5.0
4  0.1   1.0  5  0.1    1.0

     a     b  c  Big  Small
0  5.0  10.0  5  NaN    NaN
1  4.0   0.5  1  4.0    0.5
2  3.5  15.0  9  NaN    NaN
3  2.1   5.0  8  NaN    NaN
4  0.1   1.0  5  NaN    NaN

i want:

>>
      a     b  c  Big  Small
*0  5.0  10.0  5 10.0    5.0*
 1  4.0   0.5  1  4.0    0.5
*2  3.5  15.0  9 15.0    3.5*
*3  2.1   5.0  8  5.0    2.1*
*4  0.1   1.0  5  1.0    0.1*

Solution

  • You can use numpy to compare

    df['Big'] = np.where(df['a'] >= df['b'], df['a'], df['b'])
    df['Small'] = np.where(df['a'] < df['b'], df['a'], df['b'])
    

    Output

         a     b  c   Big  Small
    0  5.0  10.0  5  10.0    5.0
    1  4.0   0.5  1   4.0    0.5
    2  3.5  15.0  9  15.0    3.5
    3  2.1   5.0  8   5.0    2.1
    4  0.1   1.0  5   1.0    0.1