r time-series arima

Basic ARIMA setup for stock price forecasting in R


I am using the auto.arima function as the backbone for forecasting a stock price; an example follows.

First, I set up the parameters and download the price data, using Walmart (WMT) as an example.

library(quantmod)
library(forecast)
library(tseries) # provides adf.test, used below
ticker<-"WMT"
start_date<-"2018-04-01"
end_date<-"2024-07-30"

x<-getSymbols(ticker,from=start_date,to=end_date,auto.assign=FALSE)
x<-x[,6] # to extract the adjusted price

# parentheses matter: 1:length(x)-1 is parsed as (1:length(x))-1
plot(as.vector(x[1:(length(x)-1)]),
     as.vector(x[2:length(x)]),main="Plotting WMT’s stock price against itself with a lag of 1")

The resulting plot shows a fairly strong linear relationship, which supports an autoregressive structure.

Next, the daily returns of x are calculated and plotted.

x.return<-diff(as.vector(x))/x[1:(length(x)-1)] # simple return: (p_t - p_{t-1}) / p_{t-1}
plot(x.return, type="l",main="return")
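
A note on R operator precedence when building lagged indices like these: `:` binds tighter than binary `-`, so `1:n-1` means `(1:n)-1`, i.e. `0:(n-1)`. It still selects the intended elements only because R silently drops a zero index. A small sketch on a made-up toy vector (the names `v`, `idx_loose`, and `idx_tight` are purely illustrative):

```r
v <- c(10, 11, 12, 13)   # toy prices (made up for illustration)
n <- length(v)

idx_loose <- 1:n-1       # parsed as (1:n) - 1, i.e. 0 1 2 3
idx_tight <- 1:(n-1)     # the intended index: 1 2 3

# A zero index is dropped silently, so both select the same elements
same <- identical(v[idx_loose], v[idx_tight])

# Simple one-period returns: (p_t - p_{t-1}) / p_{t-1}
toy_returns <- diff(v) / v[idx_tight]

print(idx_loose)
print(same)
print(toy_returns)
```

Writing the parentheses explicitly avoids relying on the dropped zero index.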


Next, I run an augmented Dickey-Fuller (ADF) test to check that the return series is stationary.

adf.test(x.return)

    Augmented Dickey-Fuller Test

data:  x.return
Dickey-Fuller = -11.745, Lag order = 11, p-value = 0.01
alternative hypothesis: stationary

Next, I create the in-sample data set and set aside the out-of-sample period.

numbers.of.days<-length(x.return)
days.out.of.sample=30
in.sample<-x.return[1:(numbers.of.days-days.out.of.sample)]

I then let the auto.arima function do the heavy lifting.

> model=auto.arima(in.sample)
> model
Series: in.sample 
ARIMA(1,0,0) with non-zero mean 

Coefficients:
          ar1   mean
      -0.0801  7e-04
s.e.   0.0252  3e-04

sigma^2 = 0.0001843:  log likelihood = 4506.08
AIC=-9006.16   AICc=-9006.14   BIC=-8990.09
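
For intuition, an AR(1) with a coefficient this small reverts to its mean almost immediately, so multi-step forecasts are essentially flat. A minimal sketch using the two coefficients printed above; the last observed return `last_return` is a hypothetical value chosen for illustration, not taken from the data:

```r
# Coefficients copied from the auto.arima output
phi <- -0.0801        # ar1
mu  <- 7e-04          # mean
last_return <- 0.01   # hypothetical last observed daily return (assumption)

h <- 1:5
# h-step-ahead point forecast of a stationary AR(1):
#   mu + phi^h * (y_T - mu)
point_forecast <- mu + phi^h * (last_return - mu)

print(round(point_forecast, 6))
```

After the first couple of steps, the forecast is indistinguishable from the estimated mean.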

Next, I forecast with the model fitted by auto.arima and plot the result:

futures_returns=forecast(model,
                         h=days.out.of.sample,
                         level=c(99))

plot(futures_returns) # futures_returns is already a forecast object


I always find this kind of return plot somewhat confusing to look at; converting the forecast returns back to prices seems a bit clearer.

x.vec=as.vector(x)
stock.prices.in.sample=
  x.vec[1:(length(x.vec)-days.out.of.sample)]

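# compound_forecasts was not defined in the original post; it is assumed here to be
# the cumulative growth factors implied by the mean return forecasts
compound_forecasts=cumprod(1+as.vector(futures_returns$mean))

stock.prices.forecasted=
  stock.prices.in.sample[length(stock.prices.in.sample)]*compound_forecasts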

forecasted=
  c(x.vec[100:(length(x.vec)-days.out.of.sample)],
    stock.prices.forecasted)
plot(forecasted,
     type="l")
lines(x.vec[100:length(x.vec)],col="red")
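
The conversion from returns back to prices compounds the forecast returns onto the last in-sample price. A toy illustration with made-up numbers (`last_price` and `r_hat` are hypothetical and not taken from the WMT data):

```r
last_price <- 100                 # hypothetical last in-sample price
r_hat <- c(0.01, -0.02, 0.005)    # hypothetical forecast daily returns

# Cumulative growth factors: (1+r1), (1+r1)(1+r2), ...
growth <- cumprod(1 + r_hat)

# Implied price path over the forecast horizon
price_path <- last_price * growth
print(price_path)
```

Each forecast price is the previous one multiplied by one plus that day's forecast return.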


Finally, I extract the max and min of the last 20 values, all of which are forecasted values. So over roughly the coming 20 days, it would seem relatively safe to buy at 67.75 (the minimum of the predicted series) and sell at 68.67 (the maximum of the predicted series).

> forecast_values<-tail(forecasted,20)
> max(forecast_values)
[1] 68.67026
> min(forecast_values)
[1] 67.74756

I understand that the above setup is very basic, and I would welcome any ideas or comments on how to make it better. Many thanks for your help.

Question: I know a GARCH model can also be fitted to this data set. Is there a function similar to auto.arima that would let me incorporate a GARCH model?


Solution

  • Here are the lines of code that follow the getSymbols line, redone.

    x.Ad <- Ad(x)  # adjusted close
    plot(c(NA, x.Ad, NA) ~ c(NA, NA, x.Ad), xlab = "Lag1", ylab = "Adjusted close",
      main = ticker)  # put NA 1st to make numeric vectors
    
    x.return <- dailyReturn(x.Ad)
    plot(x.return, type = "l", main = ticker, ylab = "Return")
    
    n.oos <- 30  # number of out-of-sample days
    in.sample <- head(x.return, -n.oos)  # note negative value
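
The negative argument to `head` drops that many elements from the end instead of keeping that many from the start, and `tail` with a positive argument gives the complementary hold-out set. A minimal sketch on toy data (the vector `r` and the hold-out length of 3 are made up for illustration):

```r
r <- 1:10            # stand-in for a return series (toy data)
n.oos <- 3           # hypothetical hold-out length

in.sample  <- head(r, -n.oos)   # all but the last n.oos values
out.sample <- tail(r, n.oos)    # the last n.oos values

print(in.sample)
print(out.sample)
```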