Let df_1 and df_2 be:
In [1]: import pandas as pd
...: df_1 = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})
...: df_2 = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})
In [2]: df_1
Out[2]:
a b
0 1 4
1 2 5
2 3 6
We add a row r to df_1:
In [3]: r = pd.DataFrame({'a': ['x'], 'b': ['y']})
...: df_1 = df_1.append(r, ignore_index=True)
In [4]: df_1
Out[4]:
a b
0 1 4
1 2 5
2 3 6
3 x y
We now remove the added row from df_1 and get the original df_1 back again:
In [5]: df_1 = pd.concat([df_1, r]).drop_duplicates(keep=False)
In [6]: df_1
Out[6]:
a b
0 1 4
1 2 5
2 3 6
In [7]: df_2
Out[7]:
a b
0 1 4
1 2 5
2 3 6
While df_1 and df_2 are identical, equals() returns False.
In [8]: df_1.equals(df_2)
Out[8]: False
I did some research on SO but could not find a related question.
Am I doing something wrong? How can I get the correct result in this case?
(df_1==df_2).all().all() returns True, but it is not suitable for the case where df_1 and df_2 have different lengths.
Use pandas.testing.assert_frame_equal(df_1, df_2, check_dtype=True), which also checks that the dtypes are the same.
(In this case it picks up that your dtypes changed from int64 to object when you appended, then deleted, a row of strings; pandas did not automatically coerce the columns back down to the narrower dtype. equals() returns False for the same reason: it requires corresponding columns to have the same dtype.)
AssertionError: Attributes of DataFrame.iloc[:, 0] (column name="a") are different
Attribute "dtype" are different
[left]: object
[right]: int64
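If the goal is for equals() to return True again, one option is to restore the original dtypes after removing the row. The following is a minimal sketch, not the only way to do it. It assumes df_2 is a trustworthy reference for the intended dtypes, and it uses pd.concat in place of DataFrame.append, which was removed in pandas 2.0. The names restored, inferred, before and after are illustrative only.

```python
import pandas as pd

df_2 = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})
r = pd.DataFrame({'a': ['x'], 'b': ['y']})

# Reproduce the round trip from the question: add the string row,
# then remove it again. Both columns end up with dtype object.
df_1 = pd.concat([df_2, r], ignore_index=True)
df_1 = pd.concat([df_1, r]).drop_duplicates(keep=False)

print(df_1.dtypes)           # inspect the dtypes after the round trip
before = df_1.equals(df_2)   # dtype mismatch makes this comparison fail

# Option 1: cast each column back to the dtype of the reference frame.
restored = df_1.astype(df_2.dtypes.to_dict())

# Option 2: let pandas re-infer a better dtype for object columns.
inferred = df_1.infer_objects()

after = restored.equals(df_2)
print(before, after)
```

Option 1 is explicit about the target dtypes, so it fails loudly if a value cannot be converted. Option 2 needs no reference frame, but the dtype it infers depends on the values that remain in each column.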