
Filtering a dataframe in Base R 4.3 using the native pipe


I download a monthly time series of unemployment rates from the Federal Reserve using the alfred package:

df <- alfred::get_alfred_series("UNRATE")

Because unemployment data are revised after their first release, df contains every observation of UNRATE, revised and unrevised, along with the date on which each value was posted (realtime_period).

> head(df)
        date realtime_period UNRATE
1 1948-01-01      1960-03-15    3.5
2 1948-02-01      1960-03-15    3.8
3 1948-03-01      1960-03-15    4.0
4 1948-04-01      1960-03-15    4.0
5 1948-05-01      1960-03-15    3.6
6 1948-06-01      1960-03-15    3.8

I'm looking to filter the dataframe to find the first realtime_period associated with each date, and can do it with dplyr:

df |>
    mutate(Delta = realtime_period - date) |>
    group_by(date) |>
    filter(Delta == min(Delta)) |>
    ungroup()

Question: How do I do this in base R (I'm using R 4.3.3) instead of dplyr? I'd like to avoid the tidyverse and stick with base R for consistency, since its syntax rarely changes.

Sincerely

Thomas Philips


Solution

  • You can replace mutate with transform, and the grouped filter with subset plus ave.

    df |>
      transform(Delta = abs(realtime_period - date)) |>
      subset(Delta == ave(Delta, date, FUN = min))
    

    transform and subset are both from {base}. ave is from {stats}, which ships with every R installation and is attached by default.
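
To try the approach above without a network call, here is a minimal sketch on a made-up data frame that mimics the shape of the ALFRED output. The values, the vintage dates, and the name first_release are invented for illustration. Converting Delta to a plain number of days with as.numeric is my own choice to keep the comparison simple; it is not part of the answer above.

```r
# Toy stand-in for the ALFRED data (values and vintages are invented):
# two observation dates, each published in several vintages.
df <- data.frame(
  date = as.Date(c("1948-01-01", "1948-01-01",
                   "1948-02-01", "1948-02-01", "1948-02-01")),
  realtime_period = as.Date(c("1960-03-15", "1961-02-10",
                              "1961-02-10", "1960-03-15", "1962-01-05")),
  UNRATE = c(3.5, 3.4, 3.9, 3.8, 3.7)
)

first_release <- df |>
  # Days between the observation date and the date the value was posted.
  transform(Delta = as.numeric(realtime_period - date)) |>
  # ave() returns the per-date minimum repeated for every row of that group,
  # so the comparison keeps only the earliest vintage of each date.
  subset(Delta == ave(Delta, date, FUN = min))

print(first_release)
```

On this toy data the result keeps the two rows posted on 1960-03-15 (UNRATE 3.5 and 3.8). As with the dplyr version, rows that tie for the minimum within a date are all kept.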