pandas stringio

How to convert a string produced by pandas to_string back into a DataFrame


I have a string variable (reff) produced by pandas to_string.

The rows of reff are separated by \n. Within each row, the columns are separated by a varying number of spaces, and every column is right-aligned, as reff_split below shows. How can I convert it back to a pandas DataFrame? Thanks.

reff_split = ['                                       Name  Average Cost                                           Cuisines  Aggregate Rating    City',
 '                                Thai Garden            13                           Cafe, American, Desserts               4.2 Abilene',
 '                               Crispy Crust            54          Desserts, Bakery, Cafe, American, Seafood               3.4 Abilene',
 '                             Finger Licious            35                      Fast Food, Cafe, BBQ, Seafood               0.0 Abilene',
 '                                    Mx Corn            62                                 Tea, Cafe, Italian               3.1 Abilene',
 '                             LPK Waterfront            61            Pizza, Italian, BBQ, Fast Food, Seafood               3.7 Abilene',
 '                               Cakes Degree            18                       Cafe, Mexican, BBQ, Desserts               3.4 Abilene',
 '                             Chateau Garlic            42                 Pizza, Bakery, Desserts, Fast Food               3.0 Abilene',
 '                             Mediumwelldone            91                    Tea, French, BBQ, Cafe, Seafood               3.5 Abilene',
 '                            Biryani Express            11              Tea, Pizza, French, Bakery, Fast Food               0.0 Abilene',
 '                                     Subway            57                    Fast Food, Tea, Bakery, Italian               3.3 Abilene',
 '                       The Grand Trunk Road            80    Tea, Pizza, Bakery, BBQ, Chinese, Mediterranean               3.9 Abilene',
 "                                Bhikharam's            89                            Bakery, Pizza, Desserts               3.0 Abilene",
 '                                Pawan Foods            74                      Tea, Cafe, American, Desserts               0.0 Abilene',
 '                               Dilli Darbar            40                        Pizza, Bakery, BBQ, Seafood               2.3 Abilene',
 '                             Piyu Fast Food            88                   Mexican, Indian, Desserts, Pizza               0.0 Abilene',
 '                                Madras Cafe            54        Tea, Pizza, American, Cafe, Indian, Seafood               0.0 Abilene',
 '                                       Druk            79                       Cafe, Bakery, Pizza, Seafood               4.1 Abilene',
 '                                    Barista            88                  Fast Food, Mexican, Bakery, Pizza               3.3 Abilene',
 '                                   Mamagoto            93 Desserts, Tea, Bakery, Cafe, Indian, Mediterranean               4.1 Abilene',
 '                               Sindhi Kulfi            37                Pizza, French, BBQ, Fast Food, Cafe               3.0 Abilene',
 '                    Aggarwal Sweet & Bakers            79                          Pizza, Mediterranean, BBQ               2.9 Abilene',
 '                              Lotus Kitchen            15                                French, Bakery, BBQ               0.0 Abilene',
 '                            Inam Muradabadi            35             Tea, Seafood, Mediterranean, Fast Food               0.0 Abilene',
 "                             Domino's Pizza            89                   Bakery, Pizza, American, Seafood               0.0 Abilene",
 '                         Linx - Premier Inn            44                                 Tea, Bakery, Pizza               3.0 Abilene',
 '                              Scrummy Bites            69                          Indian, Desserts, Seafood               3.1 Abilene',
 'Vanshika Indian, Chinese, & Parantha Corner            60                                   Pizza, Fast Food               3.1 Abilene',
 '                               Otik Hotshop            82              Desserts, Tea, BBQ, Fast Food, Indian               3.1 Abilene',
 '                               Gelato Vinto            12                Seafood, Mexican, Indian, Fast Food               3.1 Abilene',
 "                                   Tomato's            21                             Tea, Cafe, Bakery, BBQ               4.1 Abilene"]

Solution

  • I doubt the solutions in the linked duplicate (including the nested one) can help you bring the dataframe back. In your case, one solution is to compute the width of each column from the header (the first element/line) and pass those widths to read_fwf:

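    import io, re

    import pandas as pd

    # Each match runs from the end of the previous header to the end of the current one.
    header = re.findall(r".+?\S(?=\s\s+|$)", reff_split[0])

    # The length of each match is the fixed width of that column.
    df = pd.read_fwf(io.StringIO("\n".join(reff_split)), widths=[len(h) for h in header])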
    

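    NB: Because the columns are right-aligned, each column's width must include the padding to the left of its header. The lookahead ends every match at the last character of a header, just before a run of two or more spaces (or the end of the line).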

    Output:

                  Name  Average Cost        Cuisines  Aggregate Rating     City
    0      Thai Garden            13  Cafe, Ameri...               4.2  Abilene
    1     Crispy Crust            54  Desserts, B...               3.4  Abilene
    2   Finger Licious            35  Fast Food, ...               0.0  Abilene
    3          Mx Corn            62  Tea, Cafe, ...               3.1  Abilene
    4   LPK Waterfront            61  Pizza, Ital...               3.7  Abilene
    ..             ...           ...             ...               ...      ...
    25   Scrummy Bites            69  Indian, Des...               3.1  Abilene
    26  Vanshika In...            60  Pizza, Fast...               3.1  Abilene
    27    Otik Hotshop            82  Desserts, T...               3.1  Abilene
    28    Gelato Vinto            12  Seafood, Me...               3.1  Abilene
    29        Tomato's            21  Tea, Cafe, ...               4.1  Abilene
    
    [30 rows x 5 columns]
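
To check the idea without the full reff_split list, here is a minimal round-trip sketch. The two-row frame is hypothetical sample data (only the column names follow the question), and it assumes the string came from to_string(index=False), which is what the question's rows suggest. The pattern also assumes that every header after the first is preceded by at least two spaces; a text column whose header is as wide as its longest value may be preceded by a single space, in which case two headers would be merged.

```python
import io
import re

import pandas as pd

# Hypothetical sample data; only the column names follow the question.
original = pd.DataFrame(
    {
        "Name": ["Thai Garden", "Mx Corn"],
        "Average Cost": [13, 62],
        "Cuisines": ["Cafe, American, Desserts", "Tea, Cafe, Italian"],
        "Aggregate Rating": [4.2, 3.1],
        "City": ["Abilene", "Abilene"],
    }
)

# Assumption: the string was produced without the index column.
text = original.to_string(index=False)
lines = text.split("\n")

# Split the header line into cells that each end at the last
# character of a column name (the columns are right-aligned).
header_cells = re.findall(r".+?\S(?=\s\s+|$)", lines[0])
widths = [len(cell) for cell in header_cells]

# Parse the whole string as a fixed-width file using those widths.
restored = pd.read_fwf(io.StringIO(text), widths=widths)

print(widths)
print(restored)
```

The widths tile the header line exactly, so their sum equals the length of the first line; that makes a cheap sanity check before calling read_fwf on real data.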