python pandas recommendation-engine

How to convert "frozenset({})" string to the data type frozenset?


I am trying to learn recommendation systems. I've imported association rules into my sheet, but the antecedents and consequents values are formatted as strings, and I need to convert them to the frozenset data type in Python. If I have a string like "frozenset({3048, 3046})", I need to convert it to an actual frozenset containing 3048 and 3046. How can I do that?

Here is the sample code.

import pandas as pd

frozen_df =  [{"antecedents" : "frozenset({3048, 3046})","consequents" : "frozenset({10})"},
              {"antecedents" : "frozenset({3504, 3507})","consequents" : "frozenset({3048, 85})"}]

frozen_df = pd.DataFrame(frozen_df)
frozen_df.dtypes

Solution

  • Edit: Since this question appears to remain alive, here's a proper version, supporting alternative string formats:

    import re
    
    def to_frozenset(x):
        # Raw string, so that \d is not treated as an (invalid) string escape
        return frozenset(map(int, re.findall(r"\d+", str(x))))
    

    Alternatively, you can split the strings up by hand, e.g. using

    def to_frozenset(x):
        return frozenset(map(int, x.split("{")[1].split("}")[0].split(",")))
    
    # On pandas 2.1 and later, applymap is deprecated; use frozen_df.map(to_frozenset) there
    frozen_df = frozen_df.applymap(to_frozenset)
    

    Note, however, that frozen_df.dtypes will still report object, since pandas has no dedicated frozenset dtype. Looking at a single element instead (e.g. frozen_df.iloc[0, 0]) shows that the elements are indeed frozensets.
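
To see how the two helpers above differ, here is a small pandas-free sketch that runs both on a few input strings. The names to_frozenset_regex and to_frozenset_split are made up here so the two versions can sit side by side, and the "frozenset()" and "{10}" inputs are assumed examples of alternative formats, not taken from the question.

```python
import re

def to_frozenset_regex(x):
    # Regex version: collect every run of digits, wherever it appears
    return frozenset(map(int, re.findall(r"\d+", str(x))))

def to_frozenset_split(x):
    # Split version: relies on the exact "frozenset({...})" layout
    return frozenset(map(int, x.split("{")[1].split("}")[0].split(",")))

sample = "frozenset({3048, 3046})"

# Both versions agree on the format from the question
regex_result = to_frozenset_regex(sample)
split_result = to_frozenset_split(sample)

# The regex version also copes with other layouts (assumed examples)
empty_result = to_frozenset_regex("frozenset()")
bare_result = to_frozenset_regex("{10}")

# The split version needs a "{" in the string, so an empty set fails
try:
    to_frozenset_split("frozenset()")
    split_failed = False
except IndexError:
    split_failed = True
```

One caveat with the regex version: the pattern only matches digits, so a minus sign would be dropped. That should not matter for non-negative item IDs like the ones in the question, but it would need a different pattern for signed values.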