How can I retrieve the value count for a particular data type? I tried several approaches using the index label, but they all ended in a KeyError.
To get the result, I ended up creating a new DataFrame with the data type names in a column. There should be a more efficient way.
import pandas as pd

df = pd.DataFrame(
[['Bob', 20, 2],['Alice', 19, 3],['Joshua', 22, 1]],
columns = ['Name', 'Age', 'Marks']
)
strdtypes = df.dtypes.value_counts()
strIndex = strdtypes.keys().tolist()
strIndex = [str(d) for d in strIndex]
df1 = pd.DataFrame({'datatype': strIndex, 'valuecount': strdtypes})
count = df1[df1['datatype'] == 'int64']['valuecount']
print("count int64 ", count)
The issue is that when you do:
counts = df.dtypes.value_counts()
the index of counts is made up of dtype objects, not strings, so a string label such as 'int64' is not found. To access the values easily, convert the index to the string representation of each object, which is available through its name property. For example:
df = pd.DataFrame(
[['Bob', 20, 2],['Alice', 19, 3],['Joshua', 22, 1]],
columns = ['Name', 'Age', 'Marks']
)
counts = df.dtypes.value_counts()
print(counts)
# int64 2
# object 1
# Name: count, dtype: int64
print(counts['int64'], counts['object'])
# KeyError: 'int64'
counts.index = [dt.name for dt in counts.index]
print(counts['int64'], counts['object'])
# 2 1
Alternatively, as pointed out by @mozway in the comments, you can convert the dtype values to strings with astype before counting:
counts = df.dtypes.astype(str).value_counts()
print(counts['int64'], counts['object'])
# 2 1
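If you need this lookup in more than one place, the astype approach above can be wrapped in a small helper. This is only a sketch: count_dtype is a hypothetical name, not part of pandas, and it assumes you want 0 rather than a KeyError when no column has the requested dtype.

```python
import pandas as pd


def count_dtype(frame: pd.DataFrame, dtype_name: str) -> int:
    """Return how many columns of frame have the dtype named dtype_name.

    Hypothetical helper, not a pandas API.
    """
    # Convert the dtype objects to strings so the counts are indexed by plain names.
    counts = frame.dtypes.astype(str).value_counts()
    # Series.get returns the default instead of raising KeyError for an absent label.
    return int(counts.get(dtype_name, 0))


df = pd.DataFrame(
    [['Bob', 20, 2], ['Alice', 19, 3], ['Joshua', 22, 1]],
    columns=['Name', 'Age', 'Marks'],
)

print(count_dtype(df, 'int64'))
# 2
print(count_dtype(df, 'float64'))
# 0
```

Using Series.get with a default means a dtype that does not occur in the frame is reported as 0 instead of raising the same KeyError the question ran into.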