pythonpandas

Adding a list to a dataframe whilst concating by length of DF


I have a df as follows :

ID, Week
5, 1
6, 1
7, 1

I have a list of reason codes as follows

['Work', 'Holiday', 'Sick', 'Jury'] 

What I'm trying to do is add this to my current dataframe but multiply this by each unique ID.

So I would have something like (I'll only use 1 unique ID for brevity),

ID, Week, Reason
5,  1,    'Work'
5,  1,    'Holiday'
5,  1,    'Sick',
5,  1,    'Jury'

I have tried all manner of concats, with varying axes, but I'm none-the-wiser how to approach this.


Solution

  • Let us do unnesting

    df['Reason']=[l]*len(df)
    unnesting(df,['Reason'])
    Out[1014]: 
        Reason  ID  Week
    0     Work   5     1
    0  Holiday   5     1
    0     Sick   5     1
    0     Jury   5     1
    1     Work   6     1
    1  Holiday   6     1
    1     Sick   6     1
    1     Jury   6     1
    2     Work   7     1
    2  Holiday   7     1
    2     Sick   7     1
    2     Jury   7     1
    

    # attached self-define function
    def unnesting(df, explode):
        idx=df.index.repeat(df[explode[0]].str.len())
        df1=pd.concat([pd.DataFrame({x:np.concatenate(df[x].values)} )for x in explode],axis=1)
        df1.index=idx
        return df1.join(df.drop(explode,1),how='left')
    

    Or we do cross join

    df.merge(pd.DataFrame({'Reason':l}).assign(Week=1))
    Out[1020]: 
        ID  Week   Reason
    0    5     1     Work
    1    5     1  Holiday
    2    5     1     Sick
    3    5     1     Jury
    4    6     1     Work
    5    6     1  Holiday
    6    6     1     Sick
    7    6     1     Jury
    8    7     1     Work
    9    7     1  Holiday
    10   7     1     Sick
    11   7     1     Jury