This might be a bit of a dumb question, but roaming around SO and other websites I can't find a straightforward answer: I've got data on the relationship between age and a continuous outcome:
library(dplyr)
library(tidyverse)
library(magrittr)
mydata <-
structure(list(ID = c(104, 157, 52, 152, 114, 221, 320, 125,
75, 171, 80, 76, 258, 82, 142, 203, 37, 92, 202, 58, 194, 38,
4, 137, 25, 87, 40, 117, 21, 255, 277, 315, 96, 134, 185, 94,
3, 153, 172, 65, 279, 209, 60, 13, 154, 160, 24, 29, 159, 213,
127, 74, 48, 126, 184, 132, 61, 141, 27, 49, 8, 39, 164, 162,
34, 205, 179, 119, 77, 135, 138, 165, 103, 253, 14, 20, 310,
84, 30, 273, 22, 105, 262, 116, 86, 83, 145, 31, 95, 51, 81,
271, 36, 50, 189, 2, 115, 7, 197, 54), age = c(67.1, 70.7, 53,
61.7, 66.1, 57.7, 54.1, 67.2, 60.9, 55.8, 40.7, 57.6, 64.1, 70.7,
47.5, 46.3, 66.7, 55, 63.3, 68.2, 61.2, 60.5, 52, 65.3, 48.9,
56.9, 62.7, 75.2, 61.4, 57.9, 53.6, 58.1, 51, 67.3, 63.9, 57,
43.2, 64.7, 62.8, 56.3, 51.7, 39.4, 45.2, 57.8, 55.7, 69.6, 61.5,
50.1, 73.7, 55.5, 65.2, 54.6, 49, 35.2, 52.9, 46.3, 55, 52.5,
54.2, 61, 57.4, 56.5, 53.6, 47.7, 64.2, 53.4, 60.9, 58.2, 60.7,
50.3, 48.3, 74.7, 52.1, 59.9, 52.4, 70.8, 61.2, 66.5, 55.4, 57.5,
59.2, 60.1, 52.3, 60.2, 54.8, 36.3, 61.5, 48.6, 56, 62, 64.8,
40.4, 68.3, 60, 69.1, 56.6, 45.3, 58.5, 52.3, 52), continuous_outcome = c(3636.6,
1128.2, 2007.5, 802.9, 332.3, 2636.1, 169.5, 67.9, 3261.8, 1920.3,
155.2, 1677.2, 198.2, 11189.7, 560.9, 633.1, 196.1, 13.9, 100.7,
7594.5, 1039.8, 83.9, 2646.8, 284.6, 306, 1135.6, 1883.1, 5681.4,
1706.2, 2241.1, 97.7, 1106.8, 1107.1, 290.8, 2123.4, 267, 115.3,
138.5, 152.7, 1338.9, 6709.8, 561.7, 1931.7, 3112.4, 1876.3,
3795.9, 5706.7, 7.4, 1324.9, 4095.4, 205.4, 1886, 177.3, 304.4,
1319.1, 415.9, 537.2, 3141.1, 740, 1976.7, 624.8, 983.1, 1163.5,
1432.6, 3730.4, 2023.4, 498.2, 652.5, 982.7, 1345.3, 138.4, 1505.1,
3528.1, 11.9, 884.5, 10661.6, 1911.4, 2800.8, 81.5, 396.4, 409.1,
417.3, 186, 1892.4, 1689.7, 0, 210.1, 210.5, 3484.5, 3196.8,
57.2, 20.2, 947, 540, 1603.1, 1571.8, 9.1, 149.2, 122, 63.2)), row.names = c(NA,
-100L), class = c("tbl_df", "tbl", "data.frame"))
As you can see in the tibble, age is a continuous variable measured to precision of 1 decimal place:
head(mydata)
# A tibble: 6 x 3
ID age continuous_outcome
<dbl> <dbl> <dbl>
1 104 67.1 3637.
2 157 70.7 1128.
3 52 53 2008.
4 152 61.7 803.
5 114 66.1 332.
6 221 57.7 2636.
When I fit a simple linear regression (for now assuming all assumptions are not-violated) I get the following beta-coefficient:
fit <-
lm(formula=continuous_outcome ~ age,
data=mydata)
fit
Call:
lm(formula = continuous_outcome ~ age, data = mydata)
Coefficients:
(Intercept) age
-3400.12 86.06
The beta-coefficient for age is 86.06
. Does this mean that, as age is measured to 1 decimal place, that for every 0.1 years increase my outcome increases by 86.06? If so, how do I rescale age
so that I am measuring the effect of age per, for example, 5 years or 10 years?
Thanks in advance!
The beta coefficient shows the amount that the dependent variable (DV, in this case continuous_outcome
) will increase for every one unit increase in your independent variable (IV, in this case age
in years).
If you want to show the relationship per 1/10th of a year, multiply your age
column before fitting the model, or divide the beta coefficient by 10.
For your specific requests, since the beta coefficient is 86.06, you can multiply this by the number of years to get the increase of the continuous variable. So:
To answer the last question (The estimate for the effect of age per 5 years), that would be 430.3, which is 86.06 * 5. So for every 5 years that a persons age
increases, the continuous_outcome
increases by 430.3 on average.