pythonpandasrenamedataframe

Convert row to column header for Pandas DataFrame


The data I have to work with is a bit messy.. It has header names inside of its data. How can I choose a row from an existing pandas dataframe and make it (rename it to) a column header?

I want to do something like:

header = df[df['old_header_name1'] == 'new_header_name1']

df.columns = header

Solution

  • In [21]: df = pd.DataFrame([(1,2,3), ('foo','bar','baz'), (4,5,6)])
    
    In [22]: df
    Out[22]: 
         0    1    2
    0    1    2    3
    1  foo  bar  baz
    2    4    5    6
    

    Set the column labels to equal the values in the 2nd row (index location 1):

    In [23]: df.columns = df.iloc[1]
    

    If the index has unique labels, you can drop the 2nd row using:

    In [24]: df.drop(df.index[1])
    Out[24]: 
    1 foo bar baz
    0   1   2   3
    2   4   5   6
    

    If the index is not unique, you could use:

    In [133]: df.iloc[pd.RangeIndex(len(df)).drop(1)]
    Out[133]: 
    1 foo bar baz
    0   1   2   3
    2   4   5   6
    

    Using df.drop(df.index[1]) removes all rows with the same label as the second row. Because non-unique indexes can lead to stumbling blocks (or potential bugs) like this, it's often better to take care that the index is unique (even though Pandas does not require it).