I am using the following code to find the RSI (Relative Strength Index) and DEMA (double exponential moving average) of a stock.
library(quantmod)
library(TTR)
getSymbols("AAPL")
chartSeries(AAPL, TA=NULL)
data=AAPL[,4]
AAPL$rsi = TTR::RSI(data)
AAPL$dema = TTR::DEMA(data)
# object B stores the copy of AAPL object and I save it in a CSV file
B = AAPL
Every day, the AAPL object gains one new row reflecting the most recent closing day. Each day the RSI and DEMA functions run on the entire dataset. It seems like a waste of CPU power and time to run RSI again and again on 12+ years of data, even though only one new row (for the last trading day) has been added.
Is there a way to find RSI, DEMA, etc. for only the last day in the AAPL object and add it to the old dataset B?
I wonder how quant traders handle this kind of operation when they receive tick data every second and need to compute RSI and a few other indicators over the new and all the past data. Even with the fastest computer, it would take several minutes to get the indicator data, and the market would have moved by then.
Thanks!
Let's say that yesterday you downloaded all of the relevant data and calculated all of the RSI and DEMA statistics. Below are the data up until March 2, 2021.
library(quantmod)
library(TTR)
getSymbols("AAPL")
chartSeries(AAPL, TA=NULL)
AAPL <- AAPL[-nrow(AAPL), ]  # drop the latest row to stand in for yesterday's download
data=AAPL[,4]
AAPL$rsi = TTR::RSI(data)
AAPL$dema = TTR::DEMA(data)
# AAPL.Open AAPL.High AAPL.Low AAPL.Close AAPL.Volume AAPL.Adjusted rsi dema
# 2021-02-23 123.76 126.71 118.39 125.86 158273000 125.86 35.08898 127.7444
# 2021-02-24 124.94 125.56 122.23 125.35 111039900 125.35 34.28019 126.5275
# 2021-02-25 124.68 126.46 120.54 120.99 148199500 120.99 28.27909 124.2326
# 2021-02-26 122.59 124.85 121.20 121.26 164320000 121.26 29.10677 122.6783
# 2021-03-01 123.75 127.93 122.79 127.79 115998300 127.79 45.49055 123.7497
# 2021-03-02 128.41 128.72 125.01 125.12 102015300 125.12 41.28885 123.7178
Then, you save this result to a CSV:
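library(readr)  # provides write_csv()
# Note: write_csv() does not write row names, so the dates held in the row names are not saved.
write_csv(as.data.frame(AAPL), "aapl.csv")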
Now, today you download the data again and you've got one new data point. Because RSI and DEMA are built on exponential smoothing, the influence of older observations decays quickly, so using only the last 200 or so trading days gives essentially the same value for the most recent day as using the whole data set. This seems to work for other symbols, too, but you'd want to make sure it generalizes.
getSymbols("AAPL")
data=AAPL[(nrow(AAPL)-200):nrow(AAPL),4]
AAPL$rsi = TTR::RSI(data)
AAPL$dema = TTR::DEMA(data)
tail(AAPL)
# AAPL.Open AAPL.High AAPL.Low AAPL.Close AAPL.Volume AAPL.Adjusted rsi dema
# 2021-02-24 124.94 125.56 122.23 125.35 111039900 125.35 34.28019 126.5275
# 2021-02-25 124.68 126.46 120.54 120.99 148199500 120.99 28.27909 124.2326
# 2021-02-26 122.59 124.85 121.20 121.26 164320000 121.26 29.10677 122.6783
# 2021-03-01 123.75 127.93 122.79 127.79 115998300 127.79 45.49055 123.7497
# 2021-03-02 128.41 128.72 125.01 125.12 102015300 125.12 41.28885 123.7178
# 2021-03-03 124.81 125.71 121.84 122.06 112430400 122.06 37.06365 122.7313
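Why a trailing window is enough can be checked without any market data. The following is a minimal base-R sketch, assuming a plain recursive EMA seeded with the first observation (TTR's own seeding may differ) and a simulated price series. The helpers `ema` and `dema` are illustrative, not the TTR functions.

```r
# Recursive EMA with smoothing ratio 2 / (n + 1), seeded with the first value.
# This seeding is an assumption made for the sketch.
ema <- function(x, n) {
  alpha <- 2 / (n + 1)
  out <- numeric(length(x))
  out[1] <- x[1]
  for (i in seq_along(x)[-1]) {
    out[i] <- alpha * x[i] + (1 - alpha) * out[i - 1]
  }
  out
}

# Double EMA: 2 * EMA(x) - EMA(EMA(x))
dema <- function(x, n) {
  e1 <- ema(x, n)
  2 * e1 - ema(e1, n)
}

set.seed(1)
prices <- 100 + cumsum(rnorm(3000))   # simulated closes, not real AAPL data

full_dema   <- dema(prices, n = 10)             # whole history
window_dema <- dema(tail(prices, 201), n = 10)  # last 201 points only

# Gap between the two estimates of the most recent value
abs(tail(full_dema, 1) - tail(window_dema, 1))

# Weight the seed of a single EMA still carries after 200 updates
(1 - 2 / (10 + 1))^200
```

The second number shows how little of the starting value survives 200 updates, which is why the window and the full history agree on the latest row. A longer smoothing period decays more slowly, so the window length should be chosen with the largest `n` you use in mind.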
You could then take this last row and append it to the previous CSV as @phiver suggested:
write_csv(as.data.frame(AAPL)[nrow(AAPL), ], "aapl.csv", append=TRUE)
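One thing to watch: as.data.frame() on an xts object keeps the dates as row names, and write_csv() does not write row names. Here is a minimal base-R sketch of the append step that stores the date as an ordinary column instead. It uses a small hypothetical table (values copied from the output above) and a temporary file rather than the real aapl.csv.

```r
# Hypothetical two-day history standing in for the saved table
history <- data.frame(
  date  = as.Date(c("2021-03-01", "2021-03-02")),
  close = c(127.79, 125.12),
  rsi   = c(45.49055, 41.28885)
)

# The single new row computed today
new_row <- data.frame(
  date  = as.Date("2021-03-03"),
  close = 122.06,
  rsi   = 37.06365
)

path <- tempfile(fileext = ".csv")   # temporary file, for illustration only

# First write: header plus history, no row names
write.csv(history, path, row.names = FALSE)

# Later write: append the new row without repeating the header
write.table(new_row, path, sep = ",", append = TRUE,
            col.names = FALSE, row.names = FALSE)

saved <- read.csv(path)
tail(saved, 1)
```

With an xts object, the date column could be built from `index(AAPL)` and the values from `coredata(AAPL)` before writing.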
The real question is what's to be gained from such a procedure. Looking at the benchmarks for the two procedures, the RSI call on the full data is about 27% slower by the median (roughly 853 vs. 672 microseconds) and about 40% slower by the mean, though it will not be noticeable if you're doing only a few calls. I didn't print the results here, but the DEMA routine is about 30% slower on the full data set. If you had to do this thousands of times per day, doing it like this might make sense, but if you had to do it 10 times per day, it may not be worth the trouble.
library(microbenchmark)
microbenchmark(TTR::RSI(AAPL[,4]), times=1000)
# Unit: microseconds
# expr min lq mean median uq max neval
# TTR::RSI(AAPL[, 4]) 797.03 823.431 1008.936 852.5145 924.193 18113.29 1000
microbenchmark(TTR::RSI(AAPL[(nrow(AAPL)-200):nrow(AAPL),4]), times=1000)
# Unit: microseconds
# expr min lq mean median uq max neval
# TTR::RSI(AAPL[(nrow(AAPL) - 200):nrow(AAPL), 4]) 634.306 652.424 710.9095 671.79 706.294 11743.02 1000
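If even the windowed call is too much, for example with tick data, another option is to keep the smoothing state and fold in each new price in constant time. Below is a minimal base-R sketch, assuming Wilder-style smoothing with n = 14. The seeding and smoothing details of TTR::RSI may differ, so treat the numbers as illustrative. All function and field names (`rsi_update`, `rsi_init`, `rsi_value`, `avg_gain`, `avg_loss`) are hypothetical.

```r
# Fold one new price into the stored state using Wilder smoothing
# (smoothing ratio 1 / n). `state` is a plain list.
rsi_update <- function(state, price) {
  change <- price - state$last
  n <- state$n
  state$avg_gain <- (state$avg_gain * (n - 1) + max(change, 0)) / n
  state$avg_loss <- (state$avg_loss * (n - 1) + max(-change, 0)) / n
  state$last <- price
  state
}

# Build the state once from history: seed with simple averages of the first
# n gains and losses, then replay the remaining prices through rsi_update().
rsi_init <- function(prices, n = 14) {
  stopifnot(length(prices) > n)
  change <- diff(prices[1:(n + 1)])
  state <- list(
    n = n,
    last = prices[n + 1],
    avg_gain = mean(pmax(change, 0)),
    avg_loss = mean(pmax(-change, 0))
  )
  for (p in prices[-(1:(n + 1))]) state <- rsi_update(state, p)
  state
}

# RSI from the stored averages (evaluates to 100 when avg_loss is zero
# and avg_gain is positive)
rsi_value <- function(state) {
  100 - 100 / (1 + state$avg_gain / state$avg_loss)
}

set.seed(42)
prices <- 100 + cumsum(rnorm(500))   # simulated closes, not real AAPL data

state_yesterday <- rsi_init(head(prices, -1))                   # done once
state_today     <- rsi_update(state_yesterday, tail(prices, 1)) # one step

# Compare the one-step update with recomputing from scratch
all.equal(rsi_value(state_today), rsi_value(rsi_init(prices)))
```

Only three numbers per symbol need to be saved between runs (the last price and the two averages), so the cost of a new observation does not grow with the length of the history. The same idea applies to DEMA by storing the two EMA values it is built from.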