r time-series forecasting arima fable-r

How do I fit a forecast into a line graph?


I am at the very end of building a forecasting model for Sydney house prices (I am only interested in the columns Date and sellPrice), but I get the error message below.


**Code:**

Data_fit %>%
  forecast(h=5) %>%
  filter(.model=='search') %>%
  autoplot(data3group_ts)

(data3group_ts: dataframe)

**Error Message:**

Error in UseMethod("filter") : 
  no applicable method for 'filter' applied to an object of class "forecast"
In addition: Warning message:
In mean.default(x, na.rm = TRUE) :
  argument is not numeric or logical: returning NA

I am new to this and have tried everything I can think of, but I don't understand why I cannot get the plot that shows the forecast (the next 5 years of Sydney house prices). Can you please help me see where I am making a mistake? (I am following this book for the topic: https://otexts.com/fpp3/arima-r.html)

The full script is below;


library(fpp3)

#Load the data
data <- read.csv("SydneyHousePrices.csv")

# Check null value

is.null(data)

#Selecting certain columns
data2 <- data %>%
select(Date, sellPrice)

#Timeseries dataframe Selection
data2ts <- data2[,c('Date','sellPrice')]
sum(is.na(data2ts))

#converting Date to a date column
data2ts$Date <- as.Date(data2ts$Date)

class(data2ts$Date)

#median Price

data3 <- data2ts %>%
group_by(Date) %>%
summarise(medPrice = median(sellPrice, na.rm = FALSE))

data3

#create a column to show year
data3$Year <- as.Date(cut(data3$Date, breaks = 'year'))

#group the years
data3group <- data3 %>%
group_by(Year) %>%
summarise(sum(medPrice)) %>%
select(Year, "medPrice" = "sum(medPrice)")

#plotting time series

data3group %>%
ggplot(aes(x = Year, y= medPrice)) +
geom_line(color = "blue", size = 1) +
scale_y_continuous(labels = comma)+
scale_x_date(date_labels = "%Y", breaks = "1 year")+
theme(axis.text.x = element_text(size = 10, angle = 90))+
labs(x= "Year", y= "Median House Prices")+
geom_point(color="blue", size = 3)

#as_tsibble convert
data3group_ts <- data3group %>%
mutate(Year = year(as.character(Year))) %>%
as_tsibble(index = Year)

#plotting
data3group_ts %>%
autoplot(medPrice) +
labs(title =  "X",
y= "Y")

#Time plot and ACF and PACF plots
data3group_ts %>%
gg_tsdisplay(difference(medPrice), plot_type = 'partial')

#Use auto.arima and specify if the series has a mean=0 or not

auto.arima(auto.arima(data3group_ts, allowmean=FALSE, allowdrift=FALSE, trace=TRUE), allowmean=FALSE, allowdrift=FALSE, trace=TRUE)

#########Response;
ARIMA(2,1,2)                    : 744.7771
ARIMA(0,1,0)                    : 737.1418
ARIMA(1,1,0)                    : 736.0715
ARIMA(0,1,1)                    : 735.3772
ARIMA(1,1,1)                    : 738.17
ARIMA(0,1,2)                    : 738.1905
ARIMA(1,1,2)                    : 741.2703

Best model: ARIMA(0,1,1)  
##########

#ARIMA fitting
Data_fit <- data3group_ts %>%
model(arima011 = ARIMA(medPrice \~ pdq (0,1,1)),
stepwise = ARIMA(medPrice),
search = ARIMA (medPrice, stepwise = FALSE))

glance(Data_fit) %>% arrange(AICc) %>% select(.model:BIC)

###############Response;

# A tibble: 3 × 6

.model    sigma2 log_lik   AIC  AICc   BIC
<chr>      <dbl>   <dbl>   <dbl> <dbl> <dbl>
1 arima011 2.97e15   -365.  735.  735.  737.
2 stepwise 2.97e15   -365.  735.  735.  737.
3 search   2.97e15   -365.  735.  735.  737.

##################

#residuals
Data_fit %>%
select(search) %>%
gg_tsresiduals()

augment(Data_fit) %>%
filter(.model=='search') %>%
features(.innov, ljung_box, lag = 10, dof = 3)

#############Response;

# A tibble: 1 × 3

.model lb_stat lb_pvalue
<chr>    <dbl>     <dbl>
1 search    3.73     0.811
######################

#FORECASTING

Data_fit %>%
forecast(h=5) %>%
filter(.model=='search') %>%
autoplot(data3group_ts)

#Returns error as mentioned above.

I tried other forecasting code as well, but it didn't work either; I get the same result.


Solution

    library(fpp3)
    library(tidyverse)
    
    data <- tibble(souvenirs) |>
      rename(Date = Month, sellPrice = Sales)
    
    data <- data |>
      select(Date, sellPrice) |>
      group_by(Date) |>
      summarise(medPrice = median(sellPrice, na.rm = FALSE)) |>
      mutate(Date = as_date(Date)) |>
      mutate(Year = cut(Date, breaks = 'year')) |>
      group_by(Year) |>
      summarise(sum(medPrice)) |>
      select(Year, "medPrice" = "sum(medPrice)") |>
      mutate(Year = year(as.character(Year))) |>
      as_tsibble(index = Year)
    
    # use fable method of auto arima
    Data_fit <- data |>
      model(ARIMA(medPrice ~ Year, trace = TRUE))
    
    Data_fit |>
      forecast(h = 5) |>
      autoplot(data)
    

    I had a go with the fpp3 data set souvenirs, using your code. The first data pipe reshapes souvenirs into what I guessed your data frame looks like. The second data pipe is your preprocessing, just reorganised into a single pipe with no logic changes. The Data_fit pipe is a rewrite of your auto.arima code using the fable method.

    There were a couple of issues in your Data_fit code: a backslash that looked like an error, and an auto.arima call nested inside another auto.arima call. Note that auto.arima comes from the forecast package, not from fable. With this Data_fit pipe you don't need a filter, because only one model is fitted. Filtering the forecast as you did is possible, though: the forecast object is of class fable and inherits from data frame. This code works with the souvenirs data. Hope this helps, Ezgi :-)
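To illustrate the point about filtering, here is a minimal sketch that fits several named models and then filters the forecasts by .model before plotting. It uses the global_economy data from the fpp3 book chapter linked in the question as a stand-in, since the Sydney CSV is not available here, so the names caf, caf_fit, fc and fc_search are only illustrative. The error in the question reports class "forecast" rather than a fable, and one possible cause (an assumption, the post does not confirm it) is that library(forecast) was loaded for auto.arima and its forecast() masked the fable one. Calling fabletools::forecast() and dplyr::filter() explicitly sidesteps that kind of masking.

```r
library(fpp3)

# Stand-in data: Central African Republic exports from global_economy,
# as in the fpp3 ARIMA chapter (the asker's CSV is not available here).
caf <- global_economy |>
  filter(Code == "CAF")

# Several named models in one mable, so .model is worth filtering on.
caf_fit <- caf |>
  model(
    arima210 = ARIMA(Exports ~ pdq(2, 1, 0)),
    stepwise = ARIMA(Exports),
    search   = ARIMA(Exports, stepwise = FALSE)
  )

# Namespace-qualified call: guards against forecast::forecast() masking it.
fc <- caf_fit |>
  fabletools::forecast(h = 5)

# Should include "fbl_ts" (a fable), which is a data frame.
print(class(fc))

# Keep only the forecasts from the model named "search".
fc_search <- fc |>
  dplyr::filter(.model == "search")

# Plot the 5-step forecast on top of the historical series.
fc_search |>
  autoplot(caf)
```

If class(fc) shows "forecast" instead of a fable in your own session, that would suggest the forecast package's function was the one called, and the namespace-qualified version above is worth trying.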