I am testing the association between crime rates and economic inequality. To choose the best-fitting model, I have been using Akaike's Information Criterion (AIC). I may also want to use the Bayesian Information Criterion (BIC). I used this code for AIC:
step(burglary_intercept_model , direction = 'forward' , scope =
formula(burglary_income_model) , trace = 1 , k = log(nrow(S1901Income2022)))
where "burglary_intercept_model" is an intercept-only model of burglary rates (no predictors), "burglary_income_model" is a regression model that includes all income brackets in my dataset, and "S1901Income2022" is the dataset containing those income bracket variables and their values.
Although R produced what it called an AIC number, "k = log(nrow(S1901Income2022))" indicates that I'm actually using BIC, right? I then edited this code in this way:
step(burglary_intercept_model , direction = 'forward' , scope =
formula(burglary_income_model) , trace = 1 , k = 2)
Note now that k = 2. That makes the output based on AIC, correct?
Yes, using k = log(n) gives you the BIC. This is stated in the documentation (?step):
k: the multiple of the number of degrees of freedom used for the penalty. Only ‘k = 2’ gives the genuine AIC: ‘k = log(n)’ is sometimes referred to as BIC or SBC.
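To see the effect of k directly, here is a minimal sketch. It uses the built-in mtcars data as a stand-in for your dataset, and the predictors are arbitrary placeholders, so treat the variable choices as assumptions rather than a recommendation. step() scores candidate models with extractAIC(), and it labels the result "AIC" in its printed trace whatever value of k you pass, which is why your first run still reported an "AIC" number.

```r
# mtcars and these predictors are placeholders (assumption), not the poster's data.
n <- nrow(mtcars)
null_model <- lm(mpg ~ 1, data = mtcars)                      # intercept only
full_model <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)  # upper scope

# Forward selection under each penalty; trace = 0 suppresses the printed path.
aic_fit <- step(null_model, direction = "forward",
                scope = formula(full_model), trace = 0, k = 2)
bic_fit <- step(null_model, direction = "forward",
                scope = formula(full_model), trace = 0, k = log(n))

# extractAIC() returns c(edf, criterion); k sets the penalty per degree of freedom.
edf      <- extractAIC(aic_fit)[1]
aic_step <- extractAIC(aic_fit, k = 2)[2]
bic_step <- extractAIC(aic_fit, k = log(n))[2]

# For one and the same model, the two criteria differ only in the penalty term.
all.equal(bic_step - aic_step, edf * (log(n) - 2))

# Compare which terms each penalty kept.
formula(aic_fit)
formula(bic_fit)
```

Because log(n) exceeds 2 once n is 8 or more, the BIC penalty is the heavier one and tends to select the same or a smaller model. Note also that for lm fits the numbers from extractAIC() differ from those returned by AIC() and BIC() by an additive constant; for a fixed dataset and a fixed k that constant is the same for every candidate model, so the ranking is unaffected.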