python pandas dataframe

Getting a hierarchy from two columns that have parent-child relationships


I have a dataframe like so:

import pandas as pd

data = {
    'Parent': [None, None,  'A',    'B',    'C',    'I',    'D',    'F',    'G',    'H',    'Z',    'Y',    None,None,None,None,    'AA',   'BB',   'CC',   'EE',   'FF',   None,   None],
    'Child': ['A',  'B',    'D',    'D',    'D',    'C',    'E',    'E',    'F',    'F',    'G',    'H',    'Z',    'Y',    'AA',   'BB',   'CC',   'CC',   'DD',   'DD',   'DD',   'EE',   'FF']
}

df = pd.DataFrame(data)
        
   Parent Child
0    None     A
1    None     B
2       A     D
3       B     D
4       C     D
5       I     C
6       D     E
7       F     E
8       G     F
9       H     F
10      Z     G
11      Y     H
12   None     Z
13   None     Y
14   None    AA
15   None    BB
16     AA    CC
17     BB    CC
18     CC    DD
19     EE    DD
20     FF    DD
21   None    EE
22   None    FF

I want an output dataframe like so:

Expected Output

I tried using the networkx package, as suggested in this post. This is the code I used:

import networkx as nx

# column names match the dataframe above ('Parent' / 'Child')
df['Parent'] = df['Parent'].fillna('No Parent')

leaves = set(df['Parent']).difference(df['Child'])
g = nx.from_pandas_edgelist(df, 'Parent', 'Child', create_using=nx.DiGraph())
ancestors = {
    n: nx.algorithms.dag.ancestors(g, n) for n in leaves
}

df1 = (pd.DataFrame.from_dict(ancestors, orient='index')
 .rename(lambda x: 'parent_{}'.format(x+1), axis=1)
 .rename_axis('child')
 .fillna('')
 )

But I get an empty dataframe. Is there an elegant way to achieve this?


Solution

  • One option is to build the final DataFrame with pd.DataFrame.from_dict from the direct predecessors of each node:

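    # assumes `import networkx as nx`, `import pandas as pd`, and the original df
    DG = nx.from_pandas_edgelist(
        df.fillna("#"), "Parent", "Child", create_using=nx.DiGraph
    )
    
    DG.remove_node("#") # remove the placeholder
    
    out = (
        pd.DataFrame.from_dict(
            # list(...) materializes the predecessor iterator for each node
            {n: list(DG.predecessors(n)) for n in DG}, orient="index"
        ).rename(columns=lambda c: f"Parent {c+1}").reset_index(names="Child")
    )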
    
    # {"Parent 1": "deepskyblue", "Parent 2": "lightcoral", "Parent 3": "springgreen"}
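
For reference, here is a minimal end-to-end sketch of the predecessor approach above, run on the data from the question. It makes a few choices that are assumptions rather than part of the original answer: root rows are dropped with dropna instead of routing them through a "#" placeholder node, predecessors are sorted so the column order is deterministic, and rename_axis(...).reset_index() is used in place of reset_index(names=...) so it does not depend on a recent pandas version. The last lines also show nx.ancestors, in case the full ancestry (not only direct parents) is what is wanted. As a side note, assuming the column names match, the attempt in the question probably comes back empty because set(parent) - set(child) selects nodes that are never a child, and those nodes have no ancestors.

```python
import networkx as nx
import pandas as pd

data = {
    "Parent": [None, None, "A", "B", "C", "I", "D", "F", "G", "H", "Z", "Y",
               None, None, None, None, "AA", "BB", "CC", "EE", "FF", None, None],
    "Child": ["A", "B", "D", "D", "D", "C", "E", "E", "F", "F", "G", "H",
              "Z", "Y", "AA", "BB", "CC", "CC", "DD", "DD", "DD", "EE", "FF"],
}
df = pd.DataFrame(data)

# Keep only real edges; rows with a missing Parent mark root nodes.
edges = df.dropna(subset=["Parent"])
DG = nx.from_pandas_edgelist(edges, "Parent", "Child", create_using=nx.DiGraph)

# Make sure every child is a node, even one that has no edges at all.
DG.add_nodes_from(df["Child"])

# One row per node, one column per direct parent (ragged rows are padded).
direct_parents = {n: sorted(DG.predecessors(n)) for n in DG}
out = (
    pd.DataFrame.from_dict(direct_parents, orient="index")
    .rename(columns=lambda c: f"Parent {c + 1}")
    .rename_axis("Child")
    .reset_index()
)
print(out)

# Optional: every ancestor of a node, not just its direct parents.
all_ancestors = {n: sorted(nx.ancestors(DG, n)) for n in DG}
print(all_ancestors["E"])
```

Nodes without a parent (for example A or Z) still get a row here, with missing values in every Parent column.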
    
