First off, I'll give you some reproducible code:
library(ggplot2)
y = c(0, 0, 1, 2, 0, 0, 1, 3, 0, 0, 3, 0, 6, 2, 8, 16, 21, 39, 48, 113,
      92, 93, 127, 159, 137, 46, 238, 132, 124, 185, 171, 250, 250, 187,
      119, 151, 292, 94, 281, 146, 163, 104, 156, 272, 273, 212, 210, 135,
      187, 208, 310, 276, 235, 246, 190, 232, 254, 446, 314, 402, 276, 279,
      386, 402, 238, 581, 434, 159, 261, 356, 440, 498, 495, 462, 306, 233,
      396, 331, 418, 293, 431, 300, 222, 222, 479, 501, 702, 790, 681)
x = 1:length(y)
Now, I'm trying to fit a 3rd-degree polynomial regression curve to this dataset. To see the coefficients of the model, I ran summary(lm(formula = y ~ poly(x, 3))), and I get an absurd result back.
Call:
lm(formula = y ~ poly(x, 3))

Residuals:
     Min       1Q   Median       3Q      Max
-253.696  -47.582   -9.709   44.314  271.183

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  223.978      9.703  23.083   <2e-16 ***
poly(x, 3)1 1420.644     91.538  15.520   <2e-16 ***
poly(x, 3)2   62.375     91.538   0.681    0.497
poly(x, 3)3  130.161     91.538   1.422    0.159
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 91.54 on 85 degrees of freedom
Multiple R-squared:  0.7411,  Adjusted R-squared:  0.732
F-statistic: 81.12 on 3 and 85 DF,  p-value: < 2.2e-16
These coefficient estimates are absurdly high for my model, and I'm confused about why this output is being returned.
Why is this happening? Where am I going wrong?
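By default, poly() builds orthogonal polynomials, so the coefficients reported are for those orthogonal basis columns, not for x, x^2 and x^3. The fitted curve is the same either way. To get the coefficients of the raw powers, I think what you want is: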
lm(y ~ poly(x, 3, raw = TRUE))
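To see the difference for yourself, here is a minimal sketch on made-up data (the toy x and y below are hypothetical, not your dataset). It uses an exact cubic so the raw coefficients are known in advance, and checks that the orthogonal and raw fits give the same fitted values even though their coefficients differ.

```r
# Hypothetical toy data: an exact cubic, so the true raw coefficients are known.
x <- 1:20
y <- 1 + 2 * x + 3 * x^2 + 4 * x^3

fit_orth <- lm(y ~ poly(x, 3))              # default: orthogonal polynomial basis
fit_raw  <- lm(y ~ poly(x, 3, raw = TRUE))  # raw basis: x, x^2, x^3

# Raw coefficients are on the scale of 1, x, x^2, x^3.
raw_coefs <- unname(coef(fit_raw))
print(round(raw_coefs, 6))

# Orthogonal coefficients are on a different scale entirely ...
print(coef(fit_orth))

# ... but both parameterisations describe the same fitted curve.
same_fit <- isTRUE(all.equal(unname(fitted(fit_orth)), unname(fitted(fit_raw))))
print(same_fit)
```

So the large numbers in your summary are not a sign of a bad fit, only of a different basis. If you only need predictions or a plotted curve, either version should work; if you want to write the model out as a + b*x + c*x^2 + d*x^3, use the raw version.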
I hope this helps!