r, tidyverse, lm, tidy, broom

How to extract adjusted R-squared values from multiple linear models run in a loop with tidy() and do()


I found this cool piece of code in the Posit Community, but I cannot ask a follow-up question there, so I am coming here.

I would like to get the adjusted R-squared value for each model fitted in this 'loop', but I cannot manage to do so. The code is the following.

library(tidyverse)
library(broom)

iris  %>% 
  group_by(Species) %>% 
  do(tidy(lm(Sepal.Length ~ Sepal.Width, .))) %>%
  filter(term != "(Intercept)")

Solution

  • tidy() returns one row per coefficient, so it does not include model-level metrics. To see the metrics of each model, including adj.r.squared, use glance() instead of tidy():

    Code

    library(tidyverse)
    library(broom)
    
    iris  %>% 
      group_by(Species) %>% 
      do(glance(lm(Sepal.Length ~ Sepal.Width, .))) 
    

    Output

    # A tibble: 3 x 13
    # Groups:   Species [3]
      Species    r.squared adj.r.squared sigma statistic  p.value    df logLik   AIC   BIC deviance df.residual  nobs
      <fct>          <dbl>         <dbl> <dbl>     <dbl>    <dbl> <dbl>  <dbl> <dbl> <dbl>    <dbl>       <int> <int>
    1 setosa         0.551         0.542 0.239      59.0 6.71e-10     1   1.73  2.53  8.27     2.73          48    50
    2 versicolor     0.277         0.262 0.444      18.4 8.77e- 5     1 -29.3  64.6  70.3      9.44          48    50
    3 virginica      0.209         0.193 0.571      12.7 8.43e- 4     1 -41.9  89.9  95.6     15.7           48    50
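
If only the adjusted R-squared is needed, the glance() result can be trimmed with select(). The sketch below is one possible way to do that, not the only one. It assumes a dplyr version that provides group_modify(), used here in place of do(), and the object names adj_r2 and slopes are made up for illustration. It also joins the values back onto the slope rows from the original tidy() call.

```r
library(dplyr)
library(broom)

# One row per species, keeping only the adjusted R-squared.
# group_modify() is used in place of do(); `adj_r2` is an illustrative name.
adj_r2 <- iris %>%
  group_by(Species) %>%
  group_modify(~ glance(lm(Sepal.Length ~ Sepal.Width, data = .x))) %>%
  ungroup() %>%
  select(Species, adj.r.squared)

# Slope terms, as in the question; `slopes` is an illustrative name.
slopes <- iris %>%
  group_by(Species) %>%
  group_modify(~ tidy(lm(Sepal.Length ~ Sepal.Width, data = .x))) %>%
  ungroup() %>%
  filter(term != "(Intercept)")

# Slope estimate and adjusted R-squared side by side, one row per species.
left_join(slopes, adj_r2, by = "Species")
```

This fits each model twice, which is fine for a small data set like iris. For larger data it may be worth fitting each model once and storing it before calling tidy() and glance().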