I am trying to learn recommendation systems. I've imported association rules into my sheet, but the antecedents and consequents values are stored as strings. I need to convert them to the frozenset data type in Python.
If I have a string like "frozenset({3048, 3046})"
I need to convert it to an actual frozenset containing 3048 and 3046.
How can I do that?
Here is the sample code.
import pandas as pd
frozen_df = [{"antecedents" : "frozenset({3048, 3046})","consequents" : "frozenset({10})"},
{"antecedents" : "frozenset({3504, 3507})","consequents" : "frozenset({3048, 85})"}]
frozen_df = pd.DataFrame(frozen_df)
frozen_df.dtypes
Edit: Since this question appears to remain alive, here's a proper version, supporting alternative string formats:
import re
def to_frozenset(x):
    # Raw string so that \d is not treated as an invalid string escape
    return frozenset(map(int, re.findall(r"\d+", str(x))))
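As a quick check of the regex version, here is a minimal sketch that repeats the function so it runs on its own. The two alternative input formats in `samples` are made-up examples for illustration, not taken from the question.

```python
import re

def to_frozenset(x):
    # Collect every run of digits in the string and convert each to int
    return frozenset(map(int, re.findall(r"\d+", str(x))))

# First entry is from the question; the other two are hypothetical formats
samples = ["frozenset({3048, 3046})", "{3048, 3046}", "(3048,3046)"]
results = [to_frozenset(s) for s in samples]
print(results)
```

Note that `\d+` only picks up runs of digits, so this assumes the items are non-negative integers: a minus sign would be dropped and a decimal like 1.5 would be split into 1 and 5.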
You can certainly split the strings up, e.g. using
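def to_frozenset(x):
    # Take the text between "{" and "}", split on commas, convert each item to int
    return frozenset(map(int, x.split("{")[1].split("}")[0].split(",")))
# On pandas >= 2.1, applymap is deprecated; use frozen_df.map(to_frozenset) there
frozen_df = frozen_df.applymap(to_frozenset)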
Note, however, that frozen_df.dtypes will still be object, since there is no "frozenset dtype" in Pandas. Instead, looking at a single element (frozen_df.iloc[0, 0]) will demonstrate that the elements are indeed frozensets.
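To see both points together (the dtype staying object while each cell holds a real frozenset), here is a self-contained sketch using the sample data from the question. It applies `Series.map` column by column instead of `applymap`, which is my own substitution so that the same code runs on older and newer pandas versions; the name `converted` is also just for illustration.

```python
import re
import pandas as pd

def to_frozenset(x):
    # Collect every run of digits in the string and convert each to int
    return frozenset(map(int, re.findall(r"\d+", str(x))))

frozen_df = pd.DataFrame([
    {"antecedents": "frozenset({3048, 3046})", "consequents": "frozenset({10})"},
    {"antecedents": "frozenset({3504, 3507})", "consequents": "frozenset({3048, 85})"},
])

# Series.map applies the function element-wise; apply runs it on each column
converted = frozen_df.apply(lambda col: col.map(to_frozenset))

print(converted.dtypes)           # column dtypes
first = converted.iloc[0, 0]
print(type(first), first)         # a single cell
```

Once the cells are frozensets, set operations work as usual, e.g. `first.issubset({3046, 3048, 10})`.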