pandasordereddict

How to convert an ordered dict into a pandas dataframe with each key value ('label') for each column


How to convert an ordered dict into a pandas dataframe with each key ('label') for each column.

```OderedDict = [OrderedDict([('dataCells',
               [OrderedDict([('label', 'BT Ltd'),
                             ('value', '00120000000I3eZAAS')]),
                OrderedDict([('label',
                              'BP | Alert generated for Rejected FileAct message in GTE'),
                             ('value', '5007R00002otgJRQAY')]),
                OrderedDict([('label', '12/27/2020 1:57 AM'),
                             ('value', '2020-12-27T06:57:16Z')]),
                OrderedDict([('label', '52'), ('value', 52)]),
                OrderedDict([('label', 'false'), ('value', False)]),
                OrderedDict([('label', 'true'), ('value', True)]),
                OrderedDict([('label', '00120000000I3eZ'),
                             ('value', '00120000000I3eZAAS')]),
                OrderedDict([('label', 'GTE'),
                             ('value', 'GTE')]),
                OrderedDict([('label', 'FM'),
                             ('value', 'FM')]),
                OrderedDict([('label', 'Cust'), ('value', 'Cust')]),
                OrderedDict([('label', 'Incident'),
                             ('value', 'Incident')]),
                OrderedDict([('label', '07370371'),
                             ('value', '5007R00002otgJRQAY')]),
                OrderedDict([('label', '5007R00002otgJR'),
                             ('value', '5007R00002otgJRQAY')]),
                OrderedDict([('label', '00120000000I3eZ'),
                             ('value', '00120000000I3eZAAS')]),
                OrderedDict([('label', 'BT UK'),
                             ('value', 'a93w000000001NSAAY')]),
                OrderedDict([('label', 'Amanda Pandas'),
                             ('value', '0052X000008LWapQAG')])])]),```

convert to pandas data frame

df = pd.DataFrame(report_results['factMap']['T!T']['rows'] )

#use pd.Series to create columns in dataframe

df = df.dataCells.apply(pd.Series)

I have attempted the above but I end up with columns of ordered dicts ({'label': 'GTE', 'value': 'GTE'}) How would I just get the label value 'GTE'?


Solution

  • Create the data frame, expand the list then convert the dict to columns

    from collections import OrderedDict
    data = [OrderedDict([('dataCells',
                   [OrderedDict([('label', 'BT Ltd'),
                                 ('value', '00120000000I3eZAAS')]),
                    OrderedDict([('label',
                                  'BP | Alert generated for Rejected FileAct message in GTE'),
                                 ('value', '5007R00002otgJRQAY')]),
                    OrderedDict([('label', '12/27/2020 1:57 AM'),
                                 ('value', '2020-12-27T06:57:16Z')]),
                    OrderedDict([('label', '52'), ('value', 52)]),
                    OrderedDict([('label', 'false'), ('value', False)]),
                    OrderedDict([('label', 'true'), ('value', True)]),
                    OrderedDict([('label', '00120000000I3eZ'),
                                 ('value', '00120000000I3eZAAS')]),
                    OrderedDict([('label', 'GTE'),
                                 ('value', 'GTE')]),
                    OrderedDict([('label', 'FM'),
                                 ('value', 'FM')]),
                    OrderedDict([('label', 'Cust'), ('value', 'Cust')]),
                    OrderedDict([('label', 'Incident'),
                                 ('value', 'Incident')]),
                    OrderedDict([('label', '07370371'),
                                 ('value', '5007R00002otgJRQAY')]),
                    OrderedDict([('label', '5007R00002otgJR'),
                                 ('value', '5007R00002otgJRQAY')]),
                    OrderedDict([('label', '00120000000I3eZ'),
                                 ('value', '00120000000I3eZAAS')]),
                    OrderedDict([('label', 'BT UK'),
                                 ('value', 'a93w000000001NSAAY')]),
                    OrderedDict([('label', 'Amanda Pandas'),
                                 ('value', '0052X000008LWapQAG')])])])]
    
    # create DF,  expand list, convert dict to columns
    df = pd.DataFrame(data).explode("dataCells")["dataCells"].apply(lambda x: pd.Series(x))
    
    

    output

                                                        label                 value
                                                       BT Ltd    00120000000I3eZAAS
     BP | Alert generated for Rejected FileAct message in GTE    5007R00002otgJRQAY
                                           12/27/2020 1:57 AM  2020-12-27T06:57:16Z
                                                           52                    52
                                                        false                 False
                                                         true                  True
                                              00120000000I3eZ    00120000000I3eZAAS
                                                          GTE                   GTE
                                                           FM                    FM
                                                         Cust                  Cust
                                                     Incident              Incident
                                                     07370371    5007R00002otgJRQAY
                                              5007R00002otgJR    5007R00002otgJRQAY
                                              00120000000I3eZ    00120000000I3eZAAS
                                                        BT UK    a93w000000001NSAAY
                                                Amanda Pandas    0052X000008LWapQAG