pythonpandasdataframe

Sorting a column of max values from a multicolumn pandas data frame


I have a multi-column pandas data frame of years and corresponding cumulative rainfall values from 1 to 183 (October to March). That means in each column the last value is the maximum column value, being the cumulative total from 1. I then create a new data frame which contains the years (col1) and the max values in col2. I would like to sort the max values in descending order. What I have tried does not work and I do not find an example which is similar to my problem. First, I reproduce the original data frame. Since the years and the days are too many, I limit this to 10 years plus the mean, and the days from 1st-31st January.

  1. Original data frame

    Avge 1945 1946 1947 1948 1949 1950 1951 1952 1953 1954 313.9 246.1 427.0 303.3 282.5 400.4 177.5 519.0 205.3 284.2 373.8 321.7 246.1 427.0 304.3 282.5 405.7 178.0 539.3 205.3 285.7 373.8 329.8 246.1 462.1 310.6 282.5 408.7 178.8 588.6 207.8 297.6 373.8 338.2 275.8 471.5 311.6 282.5 410.7 184.9 597.5 207.8 297.6 373.8 346.3 279.1 495.6 329.4 282.5 424.2 206.0 602.3 276.4 299.9 388.8 352.2 279.1 542.6 353.5 283.5 435.6 211.3 602.3 286.8 303.7 391.1 359.0 305.5 542.6 368.2 283.8 435.6 211.6 609.9 295.4 323.8 398.7 362.6 305.5 542.6 372.5 283.8 445.3 211.6 609.9 314.2 328.4 398.7 367.8 305.5 552.8 395.1 283.8 445.8 223.5 620.1 354.1 328.7 398.7 373.6 314.9 554.3 395.1 283.8 448.8 223.8 659.5 354.1 328.7 428.7 381.4 314.9 554.3 430.4 283.8 464.0 230.4 660.5 361.5 329.2 436.1 391.4 314.9 554.3 441.1 283.8 514.3 244.9 660.5 392.7 329.2 574.8 399.1 314.9 554.6 445.9 283.8 516.1 246.4 660.5 401.1 330.0 587.5 407.5 324.6 556.1 447.2 283.8 530.1 254.8 663.0 431.6 346.8 597.4 415.5 325.9 556.1 449.2 283.8 532.9 259.9 663.0 458.0 359.5 610.9 424.6 330.0 556.1 452.5 283.8 533.9 260.9 669.3 468.7 361.3 611.4 433.6 335.6 556.1 452.5 283.8 540.5 263.7 669.3 487.0 363.6 665.2 444.5 336.6 564.0 452.5 283.8 559.5 263.7 689.6 507.3 372.2 665.2 452.6 336.6 584.3 453.5 283.8 571.4 270.8 689.6 509.8 377.0 689.3 463.2 352.6 584.3 453.5 317.3 571.4 270.8 742.7 509.8 377.0 697.4 472.3 381.6 584.3 453.5 317.3 591.0 288.3 751.8 518.7 409.3 697.4 482.2 381.9 584.3 455.8 317.6 610.0 294.6 751.8 518.7 409.3 697.4 486.9 386.7 585.8 455.8 317.6 620.2 321.3 767.8 518.7 413.6 703.7 491.4 388.7 589.6 455.8 317.6 620.2 341.1 767.8 522.3 432.9 703.7 499.8 388.7 591.9 455.8 320.6 620.2 380.5 790.9 522.3 442.8 705.0 504.2 388.7 619.3 455.8 320.6 642.8 381.5 790.9 529.7 442.8 708.3 512.7 388.7 620.1 477.9 354.6 692.8 381.8 795.0 529.7 442.8 708.3 520.3 389.2 630.5 477.9 355.4 708.5 381.8 810.2 532.2 452.2 722.5 527.9 390.2 641.9 477.9 358.7 710.3 381.8 810.2 532.2 452.2 723.0 536.3 391.2 647.0 505.3 361.5 710.3 383.8 812.7 548.5 452.2 782.7 542.7 406.7 654.9 505.3 379.5 713.6 387.4 837.6 552.6 455.8 786.0

The code

import pandas as pd
import os

df = pd.read_csv('myfile.txt', sep=' ', skipinitialspace=True)
new_df = df.max()
print(new_df)

gives

Avge    542.7
1945    406.7
1946    654.9
1947    505.3
1948    379.5
1949    713.6
1950    387.4
1951    837.6
1952    552.6
1953    455.8
1954    786.0

Sorting the second column with

new_df_sorted_cols = new_df.sort_index(axis=0)

gives the same order as above, except Avge comes to the bottom. Putiing axis=1 gives error

ValueError: No axis named 1 for object type Series

Help will be appreciated.


Solution

  • Try to use this, it sorts the Series by values in descending order.

    new_df_sorted = new_df.sort_values(ascending=False)