This drives me nuts. When I searched for tips about dropping elements from a dataframe, I found nothing about mixed-type Series.
Say I have this dataframe:
import pandas as pd
df = pd.DataFrame(data={'col1': [1,2,3,4,'apple','apple'], 'col2': [3,4,5,6,7,8]})
a = df['col1']
Then 'a' is a mixed-type Series with 6 elements. How can I remove all the 'apple' values from a? I need the Series to contain just 1, 2, 3, 4.
Approach: filter to keep only the rows with numeric values (instead of converting non-numeric values to NaN and then dropping the NaN). The difference is that there is no intermediate result containing NaN, which would force the numeric values to change from integer to float.
a = pd.to_numeric(a[a.astype(str).str.isnumeric()])
Result:
The resulting dtype remains integer (int64):
print(a)
0    1
1    2
2    3
3    4
Name: col1, dtype: int64
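One thing to watch for: str.isnumeric() is only True for strings made up entirely of numeric characters, so values such as -1 or 2.5 would be dropped along with 'apple'. If that matters for your data, a possible variant (a sketch, not the only way) is to build the mask with pd.to_numeric(..., errors='coerce').notna() and still filter the original values, so no NaN ends up in the result. The names mask and filtered below are just illustrative.

```python
import pandas as pd

df = pd.DataFrame(data={'col1': [1, 2, 3, 4, 'apple', 'apple'],
                        'col2': [3, 4, 5, 6, 7, 8]})
a = df['col1']

# Boolean mask: True where the value can be parsed as a number.
# NaN only appears in this temporary mask computation, not in the result.
mask = pd.to_numeric(a, errors='coerce').notna()

# Filter the original (object-dtype) values first, then convert.
filtered = pd.to_numeric(a[mask])
print(filtered.dtype)

# The same mask also keeps negative numbers, which str.isnumeric() rejects.
b = pd.Series([-1, 2, 'apple'])
filtered_b = pd.to_numeric(b[pd.to_numeric(b, errors='coerce').notna()])
print(filtered_b.tolist())
```

Because the filtering happens on the original values, the integers are never stored next to NaN and keep their integer dtype.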
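If we instead convert the non-numeric values to NaN and then drop the NaN (starting again from the original a = df['col1']), like below:
a = pd.to_numeric(a, errors='coerce').dropna()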
The resulting dtype is forced to change to float (instead of remaining integer):
0    1.0
1    2.0
2    3.0
3    4.0
Name: col1, dtype: float64
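If you prefer the NaN route anyway, the float result can be cast back afterwards. The sketch below shows two options I would consider: a plain astype('int64') after dropna(), or pandas' nullable 'Int64' dtype, which can hold missing values without switching to float. The variable names are illustrative, and note that astype('int64') truncates any fractional values, so this assumes the remaining numbers really are whole.

```python
import pandas as pd

df = pd.DataFrame(data={'col1': [1, 2, 3, 4, 'apple', 'apple'],
                        'col2': [3, 4, 5, 6, 7, 8]})
a = df['col1']

# Coerce non-numeric values to NaN (this makes the Series float64).
coerced = pd.to_numeric(a, errors='coerce')
print(coerced.dtype)

# Option 1: drop the NaN rows, then cast the floats back to int64.
as_int = coerced.dropna().astype('int64')
print(as_int.dtype)

# Option 2: use the nullable integer dtype (capital "I"), then drop
# the missing values; the values stay integers throughout.
as_nullable = coerced.astype('Int64').dropna()
print(as_nullable.dtype)
```

Either way this is an extra conversion step, which the filter-first approach avoids.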