rloopsregressionapplylm

in R regress all columns against a single vector and store regression coefficients & R-squared values


test_data <- cbind(Fund1 = c(NA, NA, NA,1,5,6,7,8,9,10),
                   Fund2 = c(NA, 1,2,4,5,6,7,5,NA,NA),
                   Fund3 = c(NA,2,4,5,6,7,5,4,NA,NA),
                   Fund4 = c(NA,NA,NA,NA,NA,NA,NA,NA,NA,NA))
peer_average <- rowMeans(as.data.frame(test_data), na.rm = TRUE)
test_data <- cbind(test_data, data.frame(peer_average))

I want to perform a regression of Fund 1, Fund 2 and Fund 3 against peer_average. In practice, I have a much larger dataframe, so want to make this simple to extend.

My goal is to have output_matrix1 = 4 Beta Coffecients, output_matrix2 = 4 alpha coefficients and output_matrix3 = 4 r-squared values, from each respective regression.

I know to run individual regressions as: lm(y ~ x), but I am not sure how to address NA's in the series. Originally I thought lapply would work but have not been able to figure it out. I want regressions to be calculated on all pairwise overlapping series. The problem with running lm(y~x) in this case, different to previous examples: Fitting a linear model with multiple LHS is Fund4


Solution

  • Here is how I would do it. Similar to the other answer, I would go from wide to long format (one function), then I would nest, map out the regressions, and pull out the coefficients:

    library(tidyverse)
    
    pivot_longer(test_data, 
                 -peer_average, 
                 names_to = "Fund", 
                 names_pattern = "Fund(\\d+)", 
                 values_drop_na = TRUE) |>
      nest(model = -Fund) |>
      mutate(model = map(model, ~summary(lm(peer_average~value, data = .x))),
             R2 = map_dbl(model, ~.x$r.squared),
             model = map(model, ~broom::tidy(.x)$estimate),
             model = map(model, ~set_names(.x, c("Intercept", "Coef"))))  |>
      unnest_wider(model)
    #> # A tibble: 3 x 4
    #>   Fund  Intercept  Coef    R2
    #>   <chr>     <dbl> <dbl> <dbl>
    #> 1 2         0.880 0.845 0.915
    #> 2 3         0.273 0.897 0.580
    #> 3 1         2.12  0.677 0.805