My dataframe has 4 columns (one dependent variable and 3 independent).
Here's a sample:
My desired output is a contingency table, as follows:
I can only seem to get a contingency table using one independent variable, with the following code (my df is called 'table'):
pd.crosstab(index=table['Dvar'],columns=table['Var1'])
I can't seem to add any other variables to this. Is the only way to achieve this to make a separate contingency table for each variable (Var1 to Var3) and then merge/join them?
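First of all, a contingency table shows how often each combination of categories occurs, which is what lets you spot an association between features.
If you want to see the association between the independent features and the dependent one, pass all three independent variables to pd.crosstab as a list: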
pd.crosstab([table['Var1'],table['Var2'],table['Var3']],
table['Dvar'], margins = False)
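As a rough, self-contained sketch of what the call above produces: the sample data below is hypothetical (the question's actual sample is not shown), and only the column names Dvar, Var1, Var2 and Var3 come from the question.

```python
import pandas as pd

# Hypothetical sample data; the values are made up for illustration.
table = pd.DataFrame({
    "Dvar": ["a", "a", "b", "b", "a"],
    "Var1": [1, 0, 1, 1, 0],
    "Var2": [0, 0, 1, 0, 1],
    "Var3": [1, 1, 0, 0, 1],
})

# Rows: each observed (Var1, Var2, Var3) combination.
# Columns: the levels of Dvar. Cells: number of matching rows.
multi = pd.crosstab(
    index=[table["Var1"], table["Var2"], table["Var3"]],
    columns=table["Dvar"],
)
print(multi)
```

The result has a three-level row index, so it counts joint combinations of the independent variables rather than one variable at a time.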
But, as you mention, to get your desired output, use pandas.DataFrame.groupby as follows:
table.groupby('Dvar').sum()
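A minimal sketch of the groupby approach, under the assumption that Var1 to Var3 are numeric 0/1 indicators (the question does not show the data, so the values below are hypothetical). If the variables are categorical instead, one possible alternative is to build one crosstab per variable and place them side by side with pd.concat, which is roughly the "separate tables, then join" idea from the question.

```python
import pandas as pd

# Hypothetical sample data; the values are made up for illustration.
table = pd.DataFrame({
    "Dvar": ["a", "a", "b", "b", "a"],
    "Var1": [1, 0, 1, 1, 0],
    "Var2": [0, 0, 1, 0, 1],
    "Var3": [1, 1, 0, 0, 1],
})
cols = ["Var1", "Var2", "Var3"]

# Assumes 0/1 indicators: the sum per Dvar level is the count of 1s.
counts = table.groupby("Dvar")[cols].sum()
print(counts)

# Alternative for categorical variables: one crosstab per variable,
# concatenated column-wise under a top-level key naming the variable.
side_by_side = pd.concat(
    [pd.crosstab(table["Dvar"], table[c]) for c in cols],
    axis=1,
    keys=cols,
)
print(side_by_side)
```

Selecting the columns explicitly before calling sum() keeps any non-numeric columns out of the aggregation.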