python pandas dataframe

How can I split/slice the values in a dataframe column and add them into a new column followed by a string?


I have a dataframe with a "Year" column. I want to use the values in that column to build a new string column, "Decade", like this:

Year Decade
1972 1970s
1983 1980s
2002 2000s
2004 2000s

I tried:

import pandas as pd

df = pd.DataFrame({'Year':[1972,1983,2002,2004]})

df['Decade'] = str(df['Year'][0:3]) + '0s'
print(df.head())

Instead, I get the following output:

   Year                                             Decade
0  1972  0    1972\n1    1983\n2    2002\nName: Year, d...
1  1983  0    1972\n1    1983\n2    2002\nName: Year, d...
2  2002  0    1972\n1    1983\n2    2002\nName: Year, d...
3  2004  0    1972\n1    1983\n2    2002\nName: Year, d...

How do I get the code to parse through the values of the cell in the column rather than the column as a whole and add the string value at the end?


Solution

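  • str(df['Year'][0:3]) slices the first three rows of the column and converts that whole Series into one string, which is then assigned to every row. To slice each value instead, convert to string with astype, then use the str accessor: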

    df['Decade'] = df['Year'].astype(str).str[0:3] + '0s'
    

    Or, probably more efficient if the input is integers, floor-divide by 10 first to drop the last digit:

    df['Decade'] = df['Year'].floordiv(10).astype(str) + '0s'
    

    Output:

       Year Decade
    0  1972  1970s
    1  1983  1980s
    2  2002  2000s
    3  2004  2000s
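
    For reference, here is a minimal sketch that runs both approaches side by side on the sample data from the question and checks that they agree. The intermediate names (first_three_rows, by_slice, by_floordiv) are only illustrative and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'Year': [1972, 1983, 2002, 2004]})

# What the original attempt sliced: the first three ROWS of the column,
# not the first three characters of each value.
first_three_rows = df['Year'][0:3]

# Approach 1: convert each value to a string, then slice per element
# with the .str accessor.
by_slice = df['Year'].astype(str).str[0:3] + '0s'

# Approach 2: integer floor division drops the last digit (1972 -> 197),
# then convert to string and append '0s'.
by_floordiv = df['Year'].floordiv(10).astype(str) + '0s'

# Both give the same result for four-digit years.
assert by_slice.equals(by_floordiv)

df['Decade'] = by_floordiv
print(df)
```

    One difference worth noting: the string slice assumes four-digit years, so a year like 987 would come out as '9870s', whereas the floordiv version would give '980s'.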