rplotlmr-carmumin

Partial residual plot based on model average coefficients in R


I'm using the R package MuMIn to do multimodel inference and the function model.avg to average the coefficients estimated by a set of models. To visually compare the data to the estimated relationships based on the averaged coefficients, I want to use partial residual plots, similar to the ones created by the crPlots function of the car package. I've tried three ways and I'm not sure whether any is appropriate. Here is a demonstration.

library(MuMIn)
# Loading the data
data(Cement)
# Creating a full model with all the covariates we are interested in
fullModel <- lm(y ~ ., data = Cement, na.action=na.fail)
# Getting all possible models based on the covariates of the full model
muModel <- dredge(fullModel)
# Averaging across all models
avgModel <- model.avg(muModel)
# Getting the averaged coefficients
coefMod <- coef(avgModel)
coefMod
# (Intercept)          X1          X2          X4          X3 
# 65.71487660  1.45607957  0.61085531 -0.49776089 -0.07148454 

Option 1: Using crPlots

library(car) # For crPlots
# Creating a duplicate of the fullMode
hackModel <- fullModel
# Changing the coefficents to the averaged coefficients
hackModel$coefficients <- coefMod[names(coef(fullModel))]
# Changing the residuals
hackModel$residuals <- Cement$y - predict(hackModel)
# Plot the hacked model vs the full model
layout(matrix(1:8, nrow=2, byrow=TRUE))
crPlots(hackModel, layout=NA)
crPlots(fullModel, layout=NA)

Notice that the crPlots of the full and hacked versions with the average coefficients are different. crPlots Example

The question here is: Is this appropriate? The results rely on a hack that I found in this answer. Do I need to change parts of the model other than the residuals and the coefficients?

Option 2: Homemade plots

# Partial residuals: residuals(hacked model) + beta*x
# X1
# Get partial residuals
prX1 <- resid(hackModel) + coefMod["X1"]*Cement$X1
# Plot the partial residuals
plot(prX1 ~ Cement$X1)
# Add modeled relationship
abline(a=0,b=coefMod["X1"])
# X2 - X4
plot(resid(hackModel) + coefMod["X2"]*X2 ~ X2, data=Cement); abline(a=0,b=coefMod["X2"])
plot(resid(hackModel) + coefMod["X3"]*X3 ~ X3, data=Cement); abline(a=0,b=coefMod["X3"])
plot(resid(hackModel) + coefMod["X4"]*X4 ~ X4, data=Cement); abline(a=0,b=coefMod["X4"])

The plot looks different from the ones produced by the crPlots above. home made example

The partial residuals have similar patterns, but their values and modeled relationships are different. The difference in values appears to due to the fact that crPlots used centered partial residuals (see this answer for a discussion of partial residuals in R). This brings me to my third option.

Option 3: Homemade plots with centered partial residuals

# Get the centered partial residuals
pRes <- resid(hackModel, type='partial')
# X1
# Plot the partial residuals
plot(pRes[,"X1"] ~ Cement$X1)
# Plot the component - modeled relationship
lines(coefMod["X1"]*(X1-mean(X1))~X1, data=Cement)
# X2 - X4
plot(pRes[,"X2"] ~ Cement$X2); lines(coefMod["X2"]*(X2-mean(X2))~X2, data=Cement) 
plot(pRes[,"X3"] ~ Cement$X3); lines(coefMod["X3"]*(X3-mean(X3))~X3, data=Cement)
plot(pRes[,"X4"] ~ Cement$X4); lines(coefMod["X4"]*(X4-mean(X4))~X4, data=Cement)

Home made example with centered partial residuals

Now we have similar values than the crPlots above, but the relationships are still different. The difference may be related to the intercepts. But I'm not sure what I should use instead of 0.

Any suggestions of which method is more appropriate? Is there a more straightforward way to get the partial residual plots based on model averaged coefficients?

Many thanks!


Solution

  • From looking at the crPlot.lm source code, it looks like only the functions residuals(model, type="partial"), predict(model, type="terms", term=var), and functions associated with finding the names of the variables are used on the model object. It also looks like the relationship is regressed, as @BenBolker suggested. The code used in crPlot.lm is: abline(lm(partial.res[,var]~.x), lty=2, lwd=lwd, col=col.lines[1]). Thus, I think that changing the coefficients and residuals of the model is sufficient to be able to use crPlots on it. I can now also reproduce the results in a homemade way.

    library(MuMIn)
    # Loading the data
    data(Cement)
    # Creating a full model with all the covariates we are interested in
    fullModel <- lm(y ~ ., data = Cement, na.action=na.fail)
    # Getting all possible models based on the covariates of the full model
    muModel <- dredge(fullModel)
    # Averaging across all models
    avgModel <- model.avg(muModel)
    # Getting the averaged coefficients
    coefMod <- coef(avgModel)
    
    # Option 1 - crPlots
    library(car) # For crPlots
    # Creating a duplicate of the fullMode
    hackModel <- fullModel
    # Changing the coefficents to the averaged coefficient
    hackModel$coefficients <- coefMod[names(coef(fullModel))]
    # Changing the residuals
    hackModel$residuals <- Cement$y - predict(hackModel)
    
    # Plot the crPlots and the regressed homemade version 
    layout(matrix(1:8, nrow=2, byrow=TRUE))
    par(mar=c(3.5,3.5,0.5,0.5), mgp=c(2,1,0))
    crPlots(hackModel, layout=NA, ylab="Partial Res", smooth=FALSE)
    
    # Option 4 - Homemade centered and regressed
    # Get the centered partial residuals
    pRes <- resid(hackModel, type='partial')
    # X1 - X4 plot partial residuals and used lm for the relationship
    plot(pRes[,"X1"] ~ Cement$X1); abline(lm(pRes[,"X1"]~Cement$X1))
    plot(pRes[,"X2"] ~ Cement$X2); abline(lm(pRes[,"X2"]~Cement$X2))
    plot(pRes[,"X3"] ~ Cement$X3); abline(lm(pRes[,"X3"]~Cement$X3))
    plot(pRes[,"X4"] ~ Cement$X4); abline(lm(pRes[,"X4"]~Cement$X4))
    

    comparison of crPlots and regressed