I have created a dataframe from a CSV file and now I'm trying to create a cross-tab of two columns ("Personal_Status" and "Gender"). The output should look like this including the frequencies of each gender/personal status combination and the totals of each generated row and column.
Personal Status F M (blank) Grand Total
Divorced 135 21 156
Married 36 36
Single 232 1 233
Grand Total 135 289 1 425
I tried this but it's not quite where I want it:
creditData[["Personal_Status", "Gender"]].value_counts()
The output includes a column of each "Personal_Status" value, a "Gender" column, and the frequency of each combination, i.e row 1 = "Single, M, 232"
Something like this?
import pandas as pd
df = pd.DataFrame({
'Name': ['Kathy', 'Linda', 'Peter'],
'Gender': ['F', 'F', 'M'],
'Personal_Status': ['Divorced', 'Married', 'Married']})
df2 = pd.crosstab(df['Personal_Status'], df['Gender'])
df2.loc['Grand Total']= df2.sum(numeric_only=True, axis=0)
df2.loc[:, 'Grand Total'] = df2.sum(numeric_only=True, axis=1)
print(df2)
Output
Gender F M Grand Total
Personal_Status
Divorced 1 0 1
Married 1 1 2
Grand Total 2 1 3