r tsibble

How to set index and key when creating tsibble from existing data?


Problem: I want to create a tsibble (not a tibble), but it's unclear how to set the key and how to set the index.

I tried the following:

ts <- t %>%
  as_tsibble(
    key = t$key_field, 
    index = c(t$date, to$state), 
    regular = FALSE
  )

But got the error:

Error: Can't subset columns that don't exist.
x Columns `AA`, `AA`, `AA`, `AA`, etc. don't exist.

There are NO columns named 'AA'; these `AA`, `AA`, `AA`, `AA` values are actually the data!

Example Needed: I would appreciate a real-world example of a tsibble created from data in an existing data.frame.

Discussion Needed: How to create a tsibble and define its index and key using existing data. The code I provided complains of

Data

Region (chr), State (chr), Purpose (chr), Trips (dbl), Year (dbl)

1998 Q1 Adelaide    South Australia Business    135.0776903 1998
1998 Q2 Adelaide    South Australia Business    109.9873160 1998
1998 Q3 Adelaide    South Australia Business    166.0346866 1998
1998 Q4 Adelaide    South Australia Business    127.1604643 1998
1999 Q1 Adelaide    South Australia Business    137.4485333 1999
1999 Q2 Adelaide    South Australia Business    199.9125861 1999
1999 Q3 Adelaide    South Australia Business    169.3550898 1999
1999 Q4 Adelaide    South Australia Business    134.3579372 1999
2000 Q1 Adelaide    South Australia Business    154.0343980 2000
2000 Q2 Adelaide    South Australia Business    168.7763637 2000
2005 Q3 Australia's Coral Coast Western Australia   Business    28.6365371  2005
2005 Q4 Australia's Coral Coast Western Australia   Business    26.4668880  2005
2006 Q1 Australia's Coral Coast Western Australia   Business    19.0804140  2006
2006 Q2 Australia's Coral Coast Western Australia   Business    25.8851570  2006
2006 Q3 Australia's Coral Coast Western Australia   Business    35.5701650  2006
2006 Q4 Australia's Coral Coast Western Australia   Business    16.8853340  2006
2007 Q1 Australia's Coral Coast Western Australia   Business    34.5039748  2007
2007 Q2 Australia's Coral Coast Western Australia   Business    21.6070762  2007
2007 Q3 Australia's Coral Coast Western Australia   Business    38.6497565  2007
2007 Q4 Australia's Coral Coast Western Australia   Business    26.0811098  2007

Here's another attempt with a different error:

[Image: tsibble execution for index, key, error message]


Solution

  • To create a tsibble, pass the data as the first argument and refer to columns by their unquoted names (not `t$column`, which passes the column's values instead). The key and index together must uniquely identify each row. The index should be a single column that represents your time unit.

    library(ggplot2)
    library(tsibble)
    
    # txhousing is a dataset that ships with ggplot2 and contains a time series
    # of housing sales data for each city in the state of Texas.
    data(txhousing)
    
    as_tsibble(txhousing, key = city, index = date)
    # or 
    tsibble(txhousing, key = city, index = date)
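
Because the key and index must uniquely identify each row, it can help to check for duplicates before converting. The following is a minimal sketch using tsibble's `is_duplicated()` and `duplicates()`; the `sales` data frame and its columns are hypothetical toy data, not part of the question.

```r
library(tsibble)

# Hypothetical toy data: two cities observed yearly,
# with the (city = "A", year = 2001) pair accidentally repeated.
sales <- data.frame(
  city  = c("A", "A", "A", "B", "B", "B"),
  year  = c(2000L, 2001L, 2001L, 2000L, 2001L, 2002L),
  sales = c(10, 12, 13, 20, 21, 22),
  stringsAsFactors = FALSE
)

# TRUE when at least one key + index combination occurs more than once
has_dupes <- is_duplicated(sales, key = city, index = year)

# Show the offending rows so they can be fixed or aggregated
dupes <- duplicates(sales, key = city, index = year)

# After dropping the repeated row, the conversion goes through
sales_ok <- sales[-3, ]
sales_ts <- as_tsibble(sales_ok, key = city, index = year)
```

If `has_dupes` is TRUE, either the key is missing a column or the data needs to be aggregated or de-duplicated first.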
    

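    Given how your data is structured, you could use the following code, assuming that region, state and purpose together with quarter uniquely identify each row in your data. Column names are case-sensitive, so adjust them to match your data exactly (for example `Region`, `State`, `Purpose`).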

    as_tsibble(tourism, key = c(region, state, purpose), index = quarter)
    
    # if there are years where you are missing quarters 
    # or if you are missing years in the middle of a time series
    # use regular = FALSE
    as_tsibble(tourism, key = c(region, state, purpose), index = quarter, regular = FALSE)
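
A regular tsibble can still hold implicit gaps (missing time points), so it may be worth inspecting them before falling back to `regular = FALSE`. The sketch below uses tsibble's `has_gaps()`, `scan_gaps()` and `fill_gaps()`; the data frame `x` and its columns are hypothetical toy data with one quarter deliberately left out.

```r
library(tsibble)

# Hypothetical quarterly series for one region, with 2000 Q3 missing
x <- data.frame(
  quarter = c("2000 Q1", "2000 Q2", "2000 Q4", "2001 Q1"),
  region  = "A",
  trips   = c(10, 11, 13, 14),
  stringsAsFactors = FALSE
)
x$quarter <- yearquarter(x$quarter)  # text -> yearquarter index

x_ts <- as_tsibble(x, key = region, index = quarter)

gap_flags  <- has_gaps(x_ts)   # one row per key, with a logical .gaps column
gap_points <- scan_gaps(x_ts)  # the missing time points themselves
x_filled   <- fill_gaps(x_ts)  # adds the missing rows, with NA for trips
```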
    

    Here quarter is of type yearquarter, which you can create using tsibble::make_yearquarter(year, quarter), where year and quarter are integers, or by passing text such as "1998 Q1" to tsibble::yearquarter().
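
As a worked example, here is a minimal sketch that builds a tsibble from a few rows copied from the question's data. The column name `Quarter` for the first column is an assumption (the question does not name it), and the data frame name `trips` is made up for illustration.

```r
library(tsibble)

# A few rows from the question's data; `Quarter` is an assumed column name
trips <- data.frame(
  Quarter = c("1998 Q1", "1998 Q2", "1998 Q3",
              "2005 Q3", "2005 Q4", "2006 Q1"),
  Region  = c(rep("Adelaide", 3), rep("Australia's Coral Coast", 3)),
  State   = c(rep("South Australia", 3), rep("Western Australia", 3)),
  Purpose = "Business",
  Trips   = c(135.0776903, 109.9873160, 166.0346866,
              28.6365371, 26.4668880, 19.0804140),
  stringsAsFactors = FALSE
)

# Convert the text quarters into a yearquarter index
trips$Quarter <- yearquarter(trips$Quarter)

# Unquoted column names: the key identifies each series, the index is the time column
trips_ts <- as_tsibble(trips, key = c(Region, State, Purpose), index = Quarter)
trips_ts
```

Each distinct combination of `Region`, `State` and `Purpose` becomes one series, and `Quarter` orders the observations within it.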