I am using median() inside of an if-else list comprehension as such:
summary_frame = pd.DataFrame({
# ...
"50%": [df[col].median() if "float" or "int" or "time" in str(df[col].dtype) else df[col].mode() for col in df.columns]
# ...
})
The intent is that median() is used when the datatype is numerical, and otherwise it should default to mode(). This works when the datatype is numerical.
However, for other columns it tries to convert the series to numerical and throws a TypeError. Is there a way around this?
Change
if "float" or "int" or "time" in str(df[col].dtype)
to
if any(t in str(df[col].dtype) for t in ("float", "int", "time"))
The or operator in Python is not like "or" in English: it doesn't automatically distribute over the comparison operation. You can't write
if x or y or z in something
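That's parsed as
if x or y or (z in something)
Since "float" is a non-empty string, it is always truthy, so the condition is always true and median() is called on every column, including the non-numerical ones.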
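The difference can be checked directly in plain Python. This is a minimal sketch; the function names buggy_check and fixed_check are made up for illustration, and the dtype strings are just example inputs:

```python
# `in` binds tighter than `or`, so the chain below is evaluated as
# "float" or "int" or ("time" in dtype_name).
def buggy_check(dtype_name):
    return "float" or "int" or "time" in dtype_name


# any() tests each substring against dtype_name separately.
def fixed_check(dtype_name):
    return any(t in dtype_name for t in ("float", "int", "time"))


print(buggy_check("object"))   # 'float': a non-empty string, so always truthy
print(fixed_check("object"))   # False
print(fixed_check("float64"))  # True
```

The buggy version never even evaluates the `in` test, because `or` short-circuits on the first truthy operand.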
Full test code:
>>> df = pd.DataFrame({'a': [1, 2, 3, 2, 3, 6, 10], 'b': ['a', 'x', 'x', 'foo', 'b', 'x', 'y'] })
>>> summary_frame = pd.DataFrame({'50%': [df[col].median() if any(t in str(df[col].dtype) for t in ("float", "int", "time")) else df[col].mode() for col in df.columns]})
>>> summary_frame
50%
0 3.0
1 0 x
Name: b, dtype: object
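Note that mode() returns a Series (a column can have several modes), which is why row 1 above holds a whole Series rather than a single value. If one scalar per column is wanted, a possible variation is sketched below. It assumes that taking only the first mode is acceptable; the helper name middle_value is hypothetical and not part of pandas:

```python
import pandas as pd

df = pd.DataFrame({
    "a": [1, 2, 3, 2, 3, 6, 10],
    "b": ["a", "x", "x", "foo", "b", "x", "y"],
})


def middle_value(series):
    # Hypothetical helper: median for numeric-like dtypes, first mode otherwise.
    if any(t in str(series.dtype) for t in ("float", "int", "time")):
        return series.median()
    # mode() returns a Series sorted by value; .iloc[0] picks the first entry.
    # Assumption: the column has at least one non-null value, otherwise
    # mode() is empty and .iloc[0] raises IndexError.
    return series.mode().iloc[0]


summary_frame = pd.DataFrame(
    {"50%": [middle_value(df[col]) for col in df.columns]},
    index=df.columns,
)
print(summary_frame)
```

Here the "50%" column holds 3.0 for a and 'x' for b, each as a plain scalar.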