In R you can essentially write model='Lottery ~ (Literacy + Wealth + Region)^k'
and get every k-way combination of those variables.
statsmodels supports some R-style formulas for OLS regressions, but it doesn't seem to support the ^k syntax. My dataset is large enough that manually trying combinations of variables is impractical, so I am looking for a way to automate the search for interaction effects.
Formulas are handled by patsy, not by statsmodels directly.
According to the patsy documentation, the power operator, as in (a + b + c + d) ** 3, works for interaction effects of categorical variables: it expands to the main effects plus all interactions up to the given order.
See the section on ** in https://patsy.readthedocs.io/en/latest/formulas.html#the-formula-language
Aside: the power operator in Python is **, not ^.
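As a rough sketch of how the ** operator could be used to automate the search, the snippet below builds the formula string for a given k and asks patsy which columns it produces. The toy data, the column name Pop, and the helper interaction_formula are made up for illustration (they are not from the question's dataset), and numeric columns are used only to keep the resulting column names short.

```python
import numpy as np
from patsy import dmatrix

# Hypothetical toy data; the values are random and only the names matter here.
rng = np.random.default_rng(0)
data = {
    "Literacy": rng.normal(size=20),
    "Wealth": rng.normal(size=20),
    "Pop": rng.normal(size=20),
}


def interaction_formula(variables, k):
    """Return a right-hand side such as '(a + b + c) ** 2' (illustrative helper)."""
    return "(" + " + ".join(variables) + ") ** " + str(k)


variables = ["Literacy", "Wealth", "Pop"]

# k = 2: main effects plus every pairwise interaction.
rhs = interaction_formula(variables, 2)
design = dmatrix(rhs, data)
columns = design.design_info.column_names
print(rhs)
print(columns)

# k = 3 additionally includes the three-way interaction term.
columns_k3 = dmatrix(interaction_formula(variables, 3), data).design_info.column_names
print(len(columns_k3))
```

The same string should be usable on the right-hand side of a statsmodels formula, for example "Lottery ~ " + rhs passed to statsmodels.formula.api.ols, assuming the response column is present in the data.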