I'm trying to display the results of a logistic regression. My model was fit using glmer() from the lme4 package, I then used MuMIn for model averaging.
Simplified version of my model using the mtcars dataset:
glmer(vs ~ wt + am + (1|carb), database, family = binomial, na.action = "na.fail")
My desired output is two plots that show the predicted probability that vs = 1: one for wt, which is continuous, and one for am, which is binary.
I got this much working after comments from @KamilBartoń:
database <- mtcars
# Scale data
database$wt <- scale(mtcars$wt)
database$am <- scale(mtcars$am)
# Make global model
model.1 <- glmer(vs ~ wt + am + (1|carb), database, family = binomial, na.action = "na.fail")
# Model selection
model.1.set <- dredge(model.1, rank = "AICc")
# Get models with <10 delta AICc
top.models.1 <- get.models(model.1.set,subset = delta<10)
# Model averaging
model.1.avg <- model.avg(top.models.1)
# make dataframe with all values set to their mean
xweight <- as.data.frame(lapply(lapply(database[, -1], mean), rep, 100))
# add new sequence of wt to xweight along range of data
xweight$wt <- (wt = seq(min(database$wt), max(database$wt), length = 100))
# predict new values
yweight <- predict(model.1.avg, newdata = xweight, type="response", re.form=NA)
# Make plot
plot(database$wt, database$vs, pch = 20, xlab = "WEIGHT (g)", ylab = "VS")
# Add predicted line
lines(xweight$wt, yweight)
Produces:
The remaining issue is that the data are scaled and centred around 0, meaning interpretation of the graph is impossible. I'm able to unscale the data using an answer from @BenBolker to this question but this does not display correctly:
## Ben Bolker's unscale function:
## scale variable x using center/scale attributes of variable y
scfun <- function(x,y) {
scale(x,
center=attr(y,"scaled:center"),
scale=attr(y,"scaled:scale"))
}
## scale prediction frame with scale values of original data -- for all variables
xweight_sc <- transform(xweight,
wt = scfun(wt, database$wt),
am = scfun(am, database$am))
# predict new values
yweight <- predict(model.1.avg, newdata = xweight_sc, type="response", re.form=NA)
# Make plot
plot(mtcars$wt, mtcars$vs, pch = 20, xlab = "WEIGHT (g)", ylab = "VS")
# Add predicted line
lines(xweight$wt, yweight)
Produces:
I've tried this a few different ways but can't work out what the problem is. What have I done wrong?
Also, another remaining issue: How do I make a binomial plot for am?
library(lme4)
library(MuMIn)
database <- mtcars
database$wt <- scale(mtcars$wt)
database$am <- scale(mtcars$am)
model.1 <- glmer(vs ~ wt + am + (1|carb), database, family = binomial, na.action = "na.fail")
model.1.set <- dredge(model.1, rank = "AICc")
top.models.1 <- get.models(model.1.set,subset = delta<10)
model.1.avg <- model.avg(top.models.1)
The problem at hand seems to be creating a graph of the average effect, similar to what the effects package (or the ggeffects package) produces. Thomas got pretty close, but a small misunderstanding of Ben Bolker's answer led to applying the scaling a second time instead of inverting it, so the predictor ended up scaled twice. This can be seen by pulling out the relevant lines of the code above.
database$wt <- scale(mtcars$wt)
database$am <- scale(mtcars$am)
# More code
xweight <- as.data.frame(lapply(lapply(database[, -1], mean), rep, 100))
xweight$wt <- (wt = seq(min(database$wt), max(database$wt), length = 100))
# more code
scfun <- function(x,y) {
scale(x,
center=attr(y,"scaled:center"),
scale=attr(y,"scaled:scale"))
}
xweight_sc <- transform(xweight,
wt = scfun(wt, database$wt),
am = scfun(am, database$am))
From this we see that xweight is actually already scaled, and thus the second time scaling is used, we obtain
sc <- attr(database$wt, 'scaled:scale')
ce <- attr(database$wt, 'scaled:center')
# equivalent to scaling the unscaled sequence twice
xweight_sc$wt <- scale(scale(seq(min(mtcars$wt), max(mtcars$wt), length = 100), ce, sc), ce, sc)
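To make the double scaling concrete, here is a minimal sketch that needs only base R and mtcars. The names z, grid_scaled and grid_double are made up for this illustration and do not appear in the code above. It compares the range of the original wt, of the scaled grid, and of the grid after it has been scaled a second time with the same center and scale.

```r
# Scale wt once and keep the attributes that scale() stores
z  <- scale(mtcars$wt)
ce <- attr(z, "scaled:center")
sc <- attr(z, "scaled:scale")

# Grid along the range of the *scaled* predictor (what xweight$wt holds)
grid_scaled <- seq(min(z), max(z), length.out = 100)

# Applying scfun()-style scaling again shifts and shrinks the grid a second time
grid_double <- as.numeric(scale(grid_scaled, center = ce, scale = sc))

range(mtcars$wt)    # original units
range(grid_scaled)  # what the model was fitted on
range(grid_double)  # neither of the above: this is what was passed to predict()
```

The third range matches neither the data nor the scale the model was fitted on, which is why the predicted line does not line up with the points.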
What Ben Bolker describes in his answer, however, is the situation where a model uses scaled predictors while the data used for prediction are not scaled. In this case the prediction data are already scaled correctly, but one wishes to interpret them on the original scale. We simply have to invert the process. For this one could use two methods.
Note: one could use custom labels in xlab in base R.
The first method is to change only the axis: the data stay on the scaled values and just the tick labels are converted back to the original units.
# Extract scales
sc <- attr(database$wt, 'scaled:scale')
ce <- attr(database$wt, 'scaled:center')
# Create plotting and predict data
n <- 100
pred_data <- aggregate(. ~ 1, data = mtcars, FUN = mean)[rep(1, n), ]
pred_data$wt <- seq(min(database$wt), max(database$wt), length = n)
# am was scaled before fitting, so hold it at its scaled mean (approximately 0)
pred_data$am <- mean(database$am)
pred_data$vs <- predict(model.1.avg, newdata = pred_data, type = 'response', re.form = NA)
# Create breaks
library(scales) #for pretty_breaks and label_number
breaks <- pretty_breaks()(pred_data$wt, 4) #4 is arbitrary
# Unscale the breaks to be used as labels
labels <- label_number()(breaks * sc + ce) #See method 2 for explanation
# Finally we plot the result
library(ggplot2)
ggplot(data = pred_data, aes(x = wt, y = vs)) +
geom_line() +
geom_point(data = database) +
scale_x_continuous(breaks = breaks, labels = labels) #to change labels.
which is the desired result. Note that there are no confidence bands. This is due to the lack of a closed form for the confidence intervals for the original model, and it seems likely that the best way to get any estimate at all is to use bootstrapping.
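For readers who prefer base graphics, the same relabelling idea can be sketched with axis(). To keep the example self-contained it uses a plain glm() as a stand-in for the averaged glmer() model, which is an assumption made only for brevity. The names d, wt_z, fit, grid and at are likewise made up for this sketch.

```r
# Stand-in model (assumption): plain glm() instead of the averaged glmer() fit
d      <- mtcars
d$wt_z <- as.numeric(scale(mtcars$wt))
ce     <- mean(mtcars$wt)  # center used by scale()
sc     <- sd(mtcars$wt)    # scale used by scale()
fit    <- glm(vs ~ wt_z, data = d, family = binomial)

# Predict along the scaled predictor
grid   <- data.frame(wt_z = seq(min(d$wt_z), max(d$wt_z), length.out = 100))
grid$p <- predict(fit, newdata = grid, type = "response")

# Plot on the scaled axis, but suppress the default tick labels
plot(d$wt_z, d$vs, pch = 20, xaxt = "n",
     xlab = "wt (1000 lbs)", ylab = "P(vs = 1)")
lines(grid$wt_z, grid$p)

# Put ticks at scaled positions and label them in original units
at <- pretty(d$wt_z, 4)
axis(1, at = at, labels = round(at * sc + ce, 2))
```

With the real analysis one would replace fit and the predict() call by model.1.avg and the prediction code above; only the xaxt = "n" and axis() lines are specific to this approach.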
The second method is to unscale the data themselves, which simply inverts the process of scale. As scale(x) = (x - mean(x))/sd(x), we only have to isolate x: x = scale(x) * sd(x) + mean(x). This is the transformation to apply, but remember to still use the scaled data during prediction and unscale only afterwards:
# unscale the variables
pred_data$wt <- pred_data$wt * sc + ce
database$wt <- database$wt * sc + ce
# Finally plot the result
ggplot(data = pred_data, aes(x = wt, y = vs)) +
geom_line() +
geom_point(data = database)
which is the desired result.
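As a quick sanity check of the inversion formula, the following sketch (base R only; x, z and x_back are names chosen for this illustration) scales wt, unscales it with the stored attributes, and compares the result with the original values.

```r
x  <- mtcars$wt
z  <- scale(x)                   # (x - mean(x)) / sd(x), stored as a 1-column matrix
ce <- attr(z, "scaled:center")   # the mean that was subtracted
sc <- attr(z, "scaled:scale")    # the standard deviation that was divided by

# Invert the scaling: x = z * sd(x) + mean(x)
x_back <- as.numeric(z) * sc + ce

# Compare with the original values, allowing for floating point error
all.equal(x_back, x)
```

The as.numeric() call drops the matrix structure and attributes that scale() attaches, which also avoids surprises when the column is later used for plotting.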