I'm creating a small Pandas dataframe:
import pandas as pd
df = pd.DataFrame(data={'colA': [["a", "b", "c"]]})
I take a deepcopy of that df. I'm not using the Pandas method but general Python, right?
import copy
df_copy = copy.deepcopy(df)
Calling df_copy.head() gives the following:
        colA
0  [a, b, c]
Then I put these values into a dictionary:
mydict = df_copy.to_dict()
That dictionary looks like this:
{'colA': {0: ['a', 'b', 'c']}}
Finally, I remove one item of the list:
mydict['colA'][0].remove("b")
I'm surprised that the values in df_copy are updated, and I'm very confused that the values in the original dataframe are updated too! Both dataframes now contain the shortened list [a, c] in colA.
I understand Pandas doesn't really do deepcopy, but this wasn't a Pandas method. My questions are:
1) How can I build a dictionary from a dataframe so that changing the dictionary doesn't update the dataframe?
2) How can I take a copy of a dataframe that is completely independent of the original?
thanks for your help!
Cheers, Nicolas
To get a truly deep copy:
df_copy = pd.DataFrame(
    columns=df.columns, index=df.index, data=copy.deepcopy(df.values)
)
Note that putting mutable objects inside a DataFrame can be an anti-pattern, so make sure you really need it and understand what you are doing.
When copy.deepcopy is applied to an object, it looks for a __deepcopy__ method on that object and, if one exists, calls it instead of copying the object generically. This hook lets a class control how much of itself gets copied. In the case of a DataFrame instance in version 0.20.0 and above, __deepcopy__ doesn't work recursively.
Similarly, if you use DataFrame.copy(deep=True), the deep copy will copy the data but will not do so recursively: Python objects stored in the cells (such as lists) are not copied, only the references to them.
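The sharing described above can be checked directly with an identity test. This is a minimal sketch that rebuilds the dataframe from the question; the variable names are only illustrative, and the result depends on your pandas version behaving as described here.

```python
import copy
import pandas as pd

df = pd.DataFrame(data={'colA': [["a", "b", "c"]]})

via_copy_module = copy.deepcopy(df)   # dispatches to DataFrame.__deepcopy__
via_pandas = df.copy(deep=True)       # the pandas method

original_list = df['colA'].iloc[0]

# `is` compares object identity: True means the "copy" still
# references the very same list object as the original frame.
shared_via_copy_module = via_copy_module['colA'].iloc[0] is original_list
shared_via_pandas = via_pandas['colA'].iloc[0] is original_list

print(shared_via_copy_module, shared_via_pandas)
```

If both checks report True, mutating the list through either copy is visible in the original, which is the behaviour seen in the question.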
To take a truly deep copy of a DataFrame containing lists (or other Python objects), so that the copy is completely independent, you can use one of the methods below.
df_copy = pd.DataFrame(
    columns=df.columns, index=df.index, data=copy.deepcopy(df.values)
)
For a dictionary, you can use the same trick:
mydict = pd.DataFrame(
    columns=df.columns, index=df.index, data=copy.deepcopy(df.values)
).to_dict()
mydict['colA'][0].remove("b")
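As an alternative to the approach above (not part of the original suggestion), it may be simpler to deep-copy the dictionary itself. The result of to_dict() is an ordinary nested Python structure, so copy.deepcopy does recurse into it. A minimal sketch, using the dataframe from the question:

```python
import copy
import pandas as pd

df = pd.DataFrame(data={'colA': [["a", "b", "c"]]})

# to_dict() hands back references to the lists stored in df;
# deepcopy then clones the nested dicts and the lists inside them.
mydict = copy.deepcopy(df.to_dict())
mydict['colA'][0].remove("b")

print(mydict)                # the dictionary lost "b"
print(df['colA'].iloc[0])    # the list inside df is untouched
```

This avoids building an intermediate DataFrame, at the cost of copying every object in the dictionary.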
There's also a standard, if hacky, way of deep-copying Python objects: serialize them with pickle and load them back.
import pickle
df_copy = pickle.loads(pickle.dumps(df))
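A quick way to convince yourself that the pickle round trip gives an independent copy is to mutate the copy and inspect the original. This is a small sketch using the dataframe from the question; it assumes every object stored in the frame is picklable.

```python
import pickle
import pandas as pd

df = pd.DataFrame(data={'colA': [["a", "b", "c"]]})

# Serializing and deserializing rebuilds every nested object from scratch.
df_copy = pickle.loads(pickle.dumps(df))

# Mutate the list held by the copy, in place.
df_copy['colA'].iloc[0].remove("b")

print(df_copy['colA'].iloc[0])   # list in the copy, without "b"
print(df['colA'].iloc[0])        # list in the original, unchanged
```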
Feel free to ask for any clarifications, if needed.