I have a script that fits a GLM and uses it to normalise a dataset to values between 0 and 1, after which I plot the relationship. I've done this for multiple datasets and the fitted line is always curved (like the first graph), but for this one particular dataset the curve is just 3 straight lines (second graph). I'm guessing it has something to do with the newdata argument in predict, but I'm not sure.
My code:
# turn off scientific notation
options(scipen = 999)
# recreating the data
IV_BP <- structure(list(Breakpoints = c("Min", "BP1", "BP2", "BP3", "BP4", "Max"),
SES = c(-1.8, -0.3, -0.1, 0.1, 0.3, 0.8),
Normalised_value = c(0,0.2, 0.4, 0.6, 0.8, 1)),
class = "data.frame", row.names = c(NA, -6L))
IV_df <- structure(list(SES = c(-0.006, 0.078, 0.028, -0.066, 0.041, -0.025,
0.006, -0.021, -0.013, -0.145, -0.065, 0.026, 0.068, -0.22, 0.138,
0.019, 0.174, 0.107, 0.339, 0.219, 0.093, -0.057, -0.19, 0.01,
0.085, -0.011, -0.075, -0.113, -0.019, 0.141, -0.045, -0.258,
-0.02, -0.178, -0.142, -0.067, 0.1, -0.155, 0.007, -0.18, -0.258,
-0.497)), class = "data.frame", row.names = c(NA, -42L))
# make glm
glmfit <- glm(Normalised_value~SES,data=IV_BP,family = quasibinomial())
# use glm to transform values
IV_df$CC_Transformed <- predict(glmfit,newdata=IV_df,type="response")
# make a graph
plot(IV_BP$SES, IV_BP$Normalised_value,
xlab = "Socioeconomic Status Index Score",
ylab = "Normalised Values",
xlim = c(-2, 2),
pch = 19,
col = "blue",
panel.first =
c(abline(h = 0, col = "lightgrey"),
abline(h = 0.2, col = "lightgrey"),
abline(h = 0.4, col = "lightgrey"),
abline(h = 0.6, col = "lightgrey"),
abline(h = 0.8, col = "lightgrey"),
abline(h = 1, col = "lightgrey"),
lines(-2:2,predict(glmfit,newdata=data.frame(SES=-2:2),type="response"),
col = "lightblue",
lwd = 5)))
Your x values, -2:2, are only five points, so lines() joins them with straight segments; that resolution is not enough to show the curve. Increase the resolution with seq, in steps of 0.1.
And plot the line first, then overplot the points.
# make glm
glmfit <- glm(Normalised_value ~ SES, data = IV_BP, family = quasibinomial())
pred_df <- data.frame(SES = seq(-2, 2, by = 0.1))
pred_df$CC_Transformed <- predict(glmfit, newdata = pred_df, type = "response")
# make a graph
# panel.first relies on lazy evaluation, so call plot() with x and y;
# the formula method may evaluate panel.first too early
plot(pred_df$SES, pred_df$CC_Transformed,
     type = "l",
     xlab = "Socioeconomic Status Index Score",
     ylab = "Normalised Values",
     xlim = c(-2, 2),
     lwd = 5,
     col = "lightblue",
     # one abline() call draws all six horizontal grid lines
     panel.first = abline(h = seq(0, 1, by = 0.2), col = "lightgrey"))
points(Normalised_value ~ SES, data = IV_BP, pch = 19, col = "blue")
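To see how much the coarse grid distorts the line, here is a minimal sketch that compares the two grids numerically. It assumes the same breakpoint data and model as above; the names coarse, fine, coarse_interp and max_gap are made up for illustration, and approx() is used only to mimic the straight segments that lines() draws between the five coarse points.

```r
# Refit the model from the question on the breakpoint data
IV_BP <- data.frame(
  SES = c(-1.8, -0.3, -0.1, 0.1, 0.3, 0.8),
  Normalised_value = c(0, 0.2, 0.4, 0.6, 0.8, 1)
)
glmfit <- glm(Normalised_value ~ SES, data = IV_BP, family = quasibinomial())

# Coarse grid (as in the question) and fine grid (as in the answer)
coarse <- data.frame(SES = -2:2)
fine   <- data.frame(SES = seq(-2, 2, by = 0.1))
coarse$fit <- predict(glmfit, newdata = coarse, type = "response")
fine$fit   <- predict(glmfit, newdata = fine, type = "response")

# Straight-line interpolation between the coarse predictions,
# evaluated on the fine grid (hypothetical stand-in for what lines() draws)
fine$coarse_interp <- approx(coarse$SES, coarse$fit, xout = fine$SES)$y

# Largest vertical gap between the straight segments and the fitted curve
max_gap <- max(abs(fine$fit - fine$coarse_interp))

print(nrow(coarse))  # number of points the question's lines() call received
print(nrow(fine))    # number of points after increasing the resolution
print(max_gap)
```

If max_gap is clearly above zero, the straight segments are an artefact of the grid, not of the model. The other datasets presumably looked curved because their fitted curves bend more gradually over -2:2, but that is a guess.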