Tags: r, lm, predict, coefficients, poly

Modifying lm coefficients - why is the predict() output shifted as if the intercept had been modified?


I have an lm model of mortality data as a function of daily temperature. To estimate a possible adaptation to climate change, I would like to reduce the slope of the curve by 10%, so I multiplied the slope coefficients of the lm model by 0.9.

However, the predict() output of this modified model is unexpected. The slope is reduced, but the curves don't meet at x = 0; instead they meet at the intercept of approx. 133. Which leads to the next question: why is this intercept not 0?

Here is a reproducible example:

x <- seq(0, 20, 0.1)
y <- seq(0, 20, 0.1)^2

mod <- lm(y ~ poly(x, 2))
mod$coefficients
(Intercept) poly(x, 2)1 poly(x, 2)2 
133.6667   1645.2355    426.9008  

mody <- mod
mody$coefficients[2] <- mody$coefficients[2]*0.9
mody$coefficients[3] <- mody$coefficients[3]*0.9
mody$coefficients
(Intercept) poly(x, 2)1 poly(x, 2)2 
133.6667   1480.7120    384.2108 

plot(x, predict(mod), type="l")
lines(x, predict(mody), col="red")

(plot of predict(mod) and predict(mody): the curves meet near y ≈ 133 rather than at 0)

I have tried to find the reason for the shifted output, and I assume it lies somewhere in the predict() function. To check the modified coefficients, I tried the following, and it shows the expected output.

curve(coef(mod)[1] + coef(mod)[2] * x + coef(mod)[3] * x^2, from=0, to=20, xlab="x", ylab="y")
curve(coef(mody)[1] + coef(mody)[2] * x + coef(mody)[3] * x^2, from=0, to=20, xlab="x", ylab="y", add = TRUE)

(plot of the two curve() calls: here the modified curve meets the original at x = 0, as expected)

Why is the predict() output different?

Why is the intercept in the example not 0?

Or how can I "predict" from the modified model "by hand" without using predict()?
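Regarding the last question, a sketch of what predict() does internally (assuming the model from the example above): for an lm fit without newdata, the predictions are just the model matrix multiplied by the coefficient vector, and that model matrix is built from the orthogonalized poly() basis, not from raw powers of x.

```r
# Sketch: reproduce predict() "by hand" for the example model.
# Note the model matrix uses the orthogonalized poly() basis,
# not the raw powers 1, x, x^2.
x <- seq(0, 20, 0.1)
y <- x^2
mod <- lm(y ~ poly(x, 2))
manual <- drop(model.matrix(mod) %*% coef(mod))
all.equal(unname(manual), unname(predict(mod)))  # TRUE
```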

Thanks for your help!

EDIT: The answer from DaveArmstrong solved the problem for the first example by using raw=TRUE in poly(). However, with other data it still doesn't work, maybe due to negative coefficients in the model (?)

Here is an example with my original data:

x <- seq(15.0, 23.5, 0.1)
y <- c(0.992, 0.998, 1.012, 1.013, 1.015, 1.021, 1.028, 1.027, 1.023, 1.029, 1.032, 1.032, 1.029, 1.036, 1.035, 1.041, 1.043, 1.043, 1.037, 1.037, 1.039, 1.037, 1.041, 1.047, 1.047, 1.048, 1.045, 1.048, 1.044, 1.037, 1.046, 1.042, 1.037, 1.034, 1.032, 1.031, 1.030, 1.034,
1.044, 1.046, 1.036, 1.034, 1.049, 1.050, 1.037, 1.041, 1.046, 1.062, 1.077, 1.084, 1.091, 1.106, 1.114, 1.127, 1.120, 1.122, 1.130,
1.122, 1.135, 1.164, 1.187, 1.186, 1.195, 1.201, 1.197, 1.204, 1.201, 1.205, 1.203, 1.200, 1.205, 1.232, 1.218, 1.218, 1.249, 1.245,
1.253, 1.227, 1.232, 1.252, 1.258, 1.254, 1.248, 1.245, 1.261, 1.289)

org <- lm(y ~ poly(x, 2, raw=TRUE))
coef(org)
(Intercept) poly(x, 2, raw = TRUE)1 poly(x, 2, raw = TRUE)2 
            2.240583377            -0.153426285             0.004822839 
  
orgm <- org
orgm$coefficients[2] <- orgm$coefficients[2]*1.1 #reducing negative coefficient
orgm$coefficients[3] <- orgm$coefficients[3]*0.9

plot(predict(org), ylim=c(0,1.5), type="l")
lines(predict(orgm), col="red")
legend("topleft", legend=c("Original", "Modified"), col=c("black", "red"), lty=c(1,1))

In the resulting plot, the modified model is somehow shifted to lower y-values, and the slope also seems incorrect. Why is the output still not as expected?


Solution

  • I think the problem is that the poly() function orthogonalizes the polynomial regressors by default. In your example, the only real relationship in the data is with the squared term. If you do this with the raw polynomials instead, it should work.

    x <- seq(0, 20, 0.1)
    y <- seq(0, 20, 0.1)^2
    
    mod <- lm(y ~ poly(x, 2, raw=TRUE))
    mod$coefficients
    # (Intercept) poly(x, 2, raw = TRUE)1 poly(x, 2, raw = TRUE)2 
    # -6.961533e-14            1.658415e-14            1.000000e+00 
    
    mody <- mod
    mody$coefficients[2] <- mody$coefficients[2]*0.9
    mody$coefficients[3] <- mody$coefficients[3]*0.9
    mody$coefficients
    # (Intercept) poly(x, 2, raw = TRUE)1 poly(x, 2, raw = TRUE)2 
    # -6.961533e-14            1.492574e-14            9.000000e-01 
    
    plot(x, predict(mod), type="l")
    lines(x, predict(mody), col="red")
    legend("topleft", legend=c("Original", "Modified"), col=c("black", "red"), lty=c(1,1))
    

    (plot: with raw polynomials, the original and modified curves now meet at x = 0)
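For intuition about why the orthogonal basis behaves differently, a quick check (a sketch, reusing the same x as above) shows that the columns poly() returns are orthonormal and mean-centered, i.e. linear combinations of 1, x, and x^2 rather than the raw powers themselves:

```r
# Sketch: poly() returns orthonormal, mean-centered columns.
x <- seq(0, 20, 0.1)
p <- poly(x, 2)
round(crossprod(p), 10)  # 2x2 identity matrix: columns are orthonormal
round(colSums(p), 10)    # both columns sum to zero (orthogonal to the intercept)
```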

    For a bit more context, here is how in this example the orthogonalized polynomials relate to the raw polynomials (the first column gives the coefficients relating the raw polynomials to the first orthogonalized polynomial regressor and the second column gives the coefficients relating the raw polynomials to the second-order orthogonalized polynomial regressor).

    p2 <- poly(x, 2)
    round(coef(lm(p2 ~ poly(x, 2, raw=TRUE))), 5)
    #                                1        2
    # (Intercept)             -0.12156  0.15538
    # poly(x, 2, raw = TRUE)1  0.01216 -0.04685
    # poly(x, 2, raw = TRUE)2  0.00000  0.00234
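This substitution can also be checked numerically. The following sketch recovers the raw-polynomial coefficients implied by the orthogonal fit (using the mapping just shown) and compares them with a direct raw-polynomial fit:

```r
# Sketch: map the orthogonal-basis coefficients back to the raw basis
# and confirm they match coef(lm(y ~ poly(x, 2, raw=TRUE))).
x <- seq(0, 20, 0.1)
y <- x^2
b <- coef(lm(y ~ poly(x, 2)))                       # orthogonal-basis coefficients
A <- coef(lm(poly(x, 2) ~ poly(x, 2, raw = TRUE)))  # the 3x2 mapping shown above
implied <- c(b[1] + b[2] * A[1, 1] + b[3] * A[1, 2],  # intercept absorbs b1*a0 + b2*c0
             b[2] * A[2, 1] + b[3] * A[2, 2],         # coefficient on x
             b[2] * A[3, 1] + b[3] * A[3, 2])         # coefficient on x^2
raw <- coef(lm(y ~ poly(x, 2, raw = TRUE)))
max(abs(implied - unname(raw)))  # essentially zero
```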
    
    

    Substituting these into the equation with the orthogonal polynomials (writing p1 and p2 for the orthogonalized regressors and b0, b1, b2 for the fitted coefficients), you would get the following:

        y = b0 + b1*p1 + b2*p2
          = b0 + b1*(-0.12156 + 0.01216*x) + b2*(0.15538 - 0.04685*x + 0.00234*x^2)
          = (b0 - 0.12156*b1 + 0.15538*b2) + (0.01216*b1 - 0.04685*b2)*x + 0.00234*b2*x^2

    When you multiply the orthogonalized polynomial coefficients by 0.9, you're doing the following:

        y = (b0 - 0.12156*(0.9*b1) + 0.15538*(0.9*b2)) + (0.01216*(0.9*b1) - 0.04685*(0.9*b2))*x + 0.00234*(0.9*b2)*x^2

    With respect to the original variables, you're also changing the intercept when you modify the coefficients on the orthogonalized regressors.


    Edit: modified answer to deal with the real data

    The solution above worked because the relationship of interest was relatively simple: the intercept and the coefficient on the first-order term were both approximately zero. When this isn't the case, the answer is somewhat more complex. In the real-data example proposed above, the x variable has a minimum value of 15. My assumption is that we would want the two curves to meet at 15, but with the modified one having a shallower slope. For this, we need to think about what this means in terms of the original and modified coefficients. In particular, we would need the two equations to produce the same prediction when x=15. Using b to represent the original coefficients and b' to represent the modified coefficients, what we would want is the following to be true:

        b0 + b1*15 + b2*15^2 = b0 + b1'*15 + b2'*15^2

    Doing the bit of algebra out (the intercepts cancel; divide through by 15^2), you would get:

        b2' = b2 + (b1 - b1')/15

    So, implementing this: let's say you multiplied the coefficient on the first-order polynomial term by 0.9. This would give:

    orgm <- org
    orgm$coefficients[2] <- orgm$coefficients[2]*0.9 
    orgm$coefficients[2]
    # poly(x, 2, raw = TRUE)1 
    # -0.1379442 
    

    We could then calculate the difference between the original and modified coefficients:

    diff <- org$coefficients[2] - orgm$coefficients[2]
    diff
    # poly(x, 2, raw = TRUE)1 
    # -0.01532713 
    

    Finally, we could plug this and the original coefficient on the second-order polynomial regressor into the formula to create the modified second-order polynomial regressor coefficient:

    orgm$coefficients[3] <- diff/15 + org$coefficients[3] 
    orgm$coefficients
    # (Intercept) poly(x, 2, raw = TRUE)1 poly(x, 2, raw = TRUE)2 
    # 2.239156804            -0.137944190             0.003796868 
    
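As a self-contained check (using the rounded coefficient values printed above for coef(org)), the constraint guarantees that the original and modified models produce the same prediction at x = 15:

```r
# Sketch: verify the meet-at-x=15 constraint via the derived formula
# b2' = b2 + (b1 - b1')/15, using the coefficient values printed above.
b <- c(2.240583377, -0.153426285, 0.004822839)  # rounded coef(org) from above
bm <- b
bm[2] <- 0.9 * b[2]                   # shallower slope
bm[3] <- b[3] + (b[2] - bm[2]) / 15   # adjust curvature so curves meet at x = 15
f <- function(cf, x) cf[1] + cf[2] * x + cf[3] * x^2
f(b, 15) - f(bm, 15)   # ~0 (up to floating point): both models agree at x = 15
```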

    Then, we could make the plot:

    plot(x, predict(org), ylim=c(0,1.5), type="l")
    lines(x, predict(orgm), col="red")
    legend("topleft", legend=c("Original", "Modified"), col=c("black", "red"), lty=c(1,1))
    

    I think this is the result you're looking for:

    (plot: the two curves now meet at x = 15, with the modified curve having a shallower slope)