python pandas dataframe word-wrap

Break/wrap long text of column names in Pandas dataframe plain text to_string output?


Consider this example:

import pandas as pd

df = pd.DataFrame({
  "LIDSA": [0, 1, 2, 3],
  "CAE": [3, 5, 7, 9],
  "FILA": [1, 2, 3, 4], # 2 is default, so table idx 1 is default
  "VUAMA": [0.5, 1.0, 1.5, 2.0],
})
df_colnames = { # https://stackoverflow.com/q/48243818
  "LIDSA": "Lorem ipsum dolor sit amet",
  "CAE": "Consectetur adipiscing elit",
  "FILA": "Fusce imperdiet libero arcu",
  "VUAMA": "Vitae ultricies augue molestie ac",
}

# "Pandas autodetects the size of your terminal window if you set pd.options.display.width = 0" https://stackoverflow.com/q/11707586
with pd.option_context('display.max_rows', None, 'display.max_columns', None, 'display.width', 0, 'max_colwidth', 20, 'display.float_format', "{:.2f}".format):
  df_str = df.rename(df_colnames,axis=1).to_string()

print(df_str)

With the terminal 111 characters wide at the time, this prints the following to stdout:

   Lorem ipsum dolor sit amet  Consectetur adipiscing elit  Fusce imperdiet libero arcu  Vitae ultricies augue
 molestie ac
0                           0                            3                            1
        0.50
1                           1                            5                            2
        1.00
2                           2                            7                            3
        1.50
3                           3                            9                            4
        2.00

So, only the last column got line-broken (and correspondingly, the values for it). I would have preferred that each long column name gets line-broken / word-wrapped at say 20 characters, and then the values output correspondingly, something like:

   Lorem ipsum dolor      Consectetur  Fusce imperdiet    Vitae ultricies
            sit amet  adipiscing elit      libero arcu  augue molestie ac
0                  0                3                1               0.50
1                  1                5                2               1.00
2                  2                7                3               1.50
3                  3                9                4               2.00

I thought 'max_colwidth', 20 would do that, but apparently it doesn't.

I even tried adding explicit line breaks to the long column names, but they just get rendered as a literal \n, and the column name still stays on one line (as also noted in "Linebreaks in pandas column names").

So, is it possible to "word-wrap"/"line break" long column names in Pandas for plain text string output?


Solution

  • You could use textwrap.wrap and tabulate:

    #  pip install tabulate
    from textwrap import wrap
    from tabulate import tabulate
    
    df_colnames_wrap = {k: '\n'.join(wrap(v, 20))
                        for k,v in df_colnames.items()}
    
    print(tabulate(df.rename(columns=df_colnames_wrap),
                   headers='keys', tablefmt='plain'))
    

    Output:

          Lorem ipsum dolor        Consectetur    Fusce imperdiet      Vitae ultricies
                   sit amet    adipiscing elit        libero arcu    augue molestie ac
     0                    0                  3                  1                  0.5
     1                    1                  5                  2                  1
     2                    2                  7                  3                  1.5
     3                    3                  9                  4                  2
    

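    With float formatting (the convert_dtypes() call is presumably there so that the integer columns stay integers; without it, tabulate appears to read the mixed int/float frame as a single float array, and floatfmt would then apply to every column):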

    print(tabulate(df.rename(columns=df_colnames_wrap)
                     .convert_dtypes(),
                   headers='keys', tablefmt='plain',
                   floatfmt='.2f'
                  ))
    

    Output:

          Lorem ipsum dolor        Consectetur    Fusce imperdiet      Vitae ultricies
                   sit amet    adipiscing elit        libero arcu    augue molestie ac
     0                    0                  3                  1                 0.50
     1                    1                  5                  2                 1.00
     2                    2                  7                  3                 1.50
     3                    3                  9                  4                 2.00
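
If installing tabulate is not an option, the same idea can be sketched with the standard library alone: wrap each header with textwrap.wrap, then right-align header lines and cell values to a common column width. This is only a minimal sketch, not something pandas or tabulate provides; render_wrapped and its width and sep parameters are hypothetical names introduced here, and the rows are hard-coded (with the floats already formatted as strings) so that the example runs without pandas.

```python
from textwrap import wrap


def render_wrapped(headers, rows, width=20, sep="  "):
    """Render rows as right-aligned plain text with word-wrapped headers.

    Hypothetical helper for illustration; not part of pandas or tabulate.
    """
    # Wrap every header into lines of at most `width` characters.
    # wrap("") returns [], so fall back to a single empty line.
    wrapped = [wrap(str(h), width) or [""] for h in headers]
    depth = max((len(w) for w in wrapped), default=0)
    # Pad shorter headers with empty lines so all have `depth` lines.
    wrapped = [w + [""] * (depth - len(w)) for w in wrapped]

    cells = [[str(c) for c in row] for row in rows]
    # Column width is the widest header line or cell in that column.
    widths = [
        max([len(line) for line in w] + [len(r[i]) for r in cells])
        for i, w in enumerate(wrapped)
    ]

    lines = [
        sep.join(w[k].rjust(widths[i]) for i, w in enumerate(wrapped))
        for k in range(depth)
    ]
    lines += [
        sep.join(c.rjust(widths[i]) for i, c in enumerate(r)) for r in cells
    ]
    return "\n".join(lines)


# The first column stands in for the DataFrame index, hence the empty header.
headers = [
    "",
    "Lorem ipsum dolor sit amet",
    "Consectetur adipiscing elit",
    "Fusce imperdiet libero arcu",
    "Vitae ultricies augue molestie ac",
]
rows = [
    (0, 0, 3, 1, "0.50"),
    (1, 1, 5, 2, "1.00"),
    (2, 2, 7, 3, "1.50"),
    (3, 3, 9, 4, "2.00"),
]

table = render_wrapped(headers, rows)
print(table)
```

With the DataFrame from the question, something like [""] + list(df_colnames.values()) for the headers and df.itertuples() for the rows should work, provided the floats are formatted to strings first if two decimals are wanted. Shorter headers are padded at the bottom here; padding at the top instead would align them to the last header line.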