My dataframe looks like this
df = structure(list(Date_Time_GMT_3 = structure(c(1625025600, 1625026500,1625027400, 1625028300, 1625029200, 1625030100),
class = c("POSIXct", "POSIXt"), tzone = "EST"),
X20676887_X2LH_S = c(26.879, 26.781,26.683, 26.585, 26.488, 26.39),
X20819831_11LH_S = c(26.39, 26.292, 26.195, 26.195, 26.097, 26),
X20822214_X4LH_S = c(26.39, 26.292,26.292, 26.195, 26.097, 26),
LH27_20822244_U_Stationary = c(23.388, 23.292, 23.292, 23.196, 23.196, 23.196)),
row.names = 2749:2754, class = "data.frame")
and I'm trying to get the linear regression equations and R squared values for all columns where the column with the string
"Stationary"
in it will always be on the x-axis.
so far I can perform the linear regression for 1 column against the "stationary"
column
model = lm(df$LH27_20822244_U_Stationary ~
df$X20822214_X4LH_S, df)
and when I use
summary(model)
afterwards it gives me some values I would like in a dataframe (i.e R squared
, Estimate Std.
, Std. Error
, Pr(>|t|)
) but 2 things I need to help with are:
stationary
in the namestationary
in it's name, and I need that to be a dataframe that looks like so... Logger_ID Reg_equation R_Squared Estimate_Std. Std_Error Pr_t..
<chr> <int> <int> <int> <int> <int>
1 X20676887_X2LH_S NA NA NA NA NA
2 X20819831_11LH_S NA NA NA NA NA
3 X20822214_X4LH_S NA NA NA NA NA
Something like this:
library(tidyverse)
library(broom)
df1 %>%
pivot_longer(
cols = starts_with("X")
) %>%
mutate(name = factor(name)) %>%
group_by(name) %>%
group_split() %>%
map_dfr(.f = function(df){
lm(LH27_20822244_U_Stationary ~ value, data = df) %>%
glance() %>%
# tidy() %>%
add_column(name = unique(df$name), .before=1)
})
Using tidy()
name term estimate std.error statistic p.value
<fct> <chr> <dbl> <dbl> <dbl> <dbl>
1 X20676887_X2LH_S (Intercept) 12.8 2.28 5.62 0.00494
2 X20676887_X2LH_S value 0.393 0.0855 4.59 0.0101
3 X20819831_11LH_S (Intercept) 10.4 3.72 2.79 0.0495
4 X20819831_11LH_S value 0.492 0.142 3.47 0.0256
5 X20822214_X4LH_S (Intercept) 10.5 3.30 3.20 0.0329
6 X20822214_X4LH_S value 0.485 0.126 3.86 0.0182
Using glance()
name r.squared adj.r.squared sigma statistic p.value df logLik AIC BIC deviance df.residual nobs
<fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int> <int>
1 X20676887_X2~ 0.841 0.801 0.0350 21.1 0.0101 1 12.8 -19.6 -20.3 0.00490 4 6
2 X20819831_11~ 0.751 0.688 0.0438 12.0 0.0256 1 11.5 -17.0 -17.6 0.00766 4 6
3 X20822214_X4~ 0.788 0.735 0.0403 14.9 0.0182 1 12.0 -17.9 -18.6 0.00651 4 6