rif-statementperiod

Creating averages across time periods


I'm a beginner to R, but I have the below dataframe with more observations in which I have at max each 'id' observation for three years 91, 99, 07.

I want to create a variable avg_ln_rd by 'id' that takes the average of 'ln_rd' and 'ln_rd' from year 91 if the first ln_rd observation is from 99 - and from year 99 if the first ln_rd observation is from 07.

      id  year ln_rd
   <dbl> <dbl> <dbl>
 1  1013  1991 3.51 
 2  1013  1999 5.64 
 3  1013  2007 4.26 
 4  1021  1991 0.899
 5  1021  1999 0.791
 6  1021  2007 0.704
 7  1034  1991 2.58 
 8  1034  1999 3.72 
 9  1034  2007 4.95 
10  1037  1991 0.262

I also already dropped any observations of 'id' that only exist for one of the three years.

My first thought was to create for each year a standalone variable for ln_rd but then i still would need to filter by id which i do not know how to do.

Then I tried using these standalone variables to form an if clause.

df$lagln_rd_99 <- ifelse(df$year == 1999, df$ln_rd_91, NA)

But again I do not know how to keep 'id' fixed.

Any help would be greatly appreciated.

EDIT:

I grouped by id using dplyr. Can I then just sort my df by id and create a new variable that is ln_rd but shifted by one row?


Solution

  • Still a bit unclear what to do if all years are present in a group but this might help.

    -- edited -- to show the desired output.

    library(dplyr)
    
    df %>% 
      group_by(id) %>% 
      arrange(id, year) %>% 
      mutate(avg91 = mean(c(ln_rd[year == 1991], ln_rd[year == 1999])), 
             avg99 = mean(c(ln_rd[year == 1999], ln_rd[year == 2007])), 
             avg91 = ifelse(year == 1991, avg91, NA), 
             avg99 = ifelse(year == 2007, avg99, NA)) %>% 
      ungroup()
    # A tibble: 15 × 5
        year    id  ln_rd  avg91 avg99
       <int> <int>  <dbl>  <dbl> <dbl>
     1  1991  3505 3.38    3.09  NA
     2  1999  3505 2.80   NA     NA
     3  1991  4584 1.45    1.34  NA
     4  1999  4584 1.22   NA     NA
     5  1991  5709 1.90    2.13  NA
     6  1999  5709 2.36   NA     NA
     7  2007  5709 3.11   NA      2.74
     8  2007  9777 2.36   NA      2.36
     9  1991 18729 4.82    5.07  NA
    10  1999 18729 5.32   NA     NA
    11  2007 18729 5.53   NA      5.42
    12  1991 20054 0.588   0.307 NA
    13  1999 20054 0.0266 NA     NA
    14  1999 62169 1.91   NA     NA
    15  2007 62169 1.45   NA      1.68