Given a simple vector of 82 observation
x = c(102, 104, 89, 89, 76, 95, 88, 112, 81, 101, 101, 104, 94, 111, 108, 104, 93, 92, 86, 113, 93, 100, 92, 80, 92, 126, 102, 109, 104, 95, 84, 81, 103, 83, 103, 83, 58, 109, 89, 93, 104, 104, 123, 104, 93, 76, 103, 103, 100, 105, 108, 90, 122, 103, 114, 102, 87, 98, 88, 107, 102, 80, 81, 96, 107, 105, 113, 98, 93, 104, 94, 107, 107, 97, 102, 82, 90, 97, 124, 109, 96, 92)
I would like to perform EMA (Exponential Moving Average) of this vector in such a way:
The 1st element of the new vector should be NA
The 2nd element should be the first element of the original vector
The 3rd element should be the EMA of the first and second elements of the original vector
The 4th element should be the EMA of the first three elements of the original vector
...
The 82th element should be the EMA of all the values of the original vector except for the last one
The ideas are to to place greater weight on the most recent vector elements and that also the last elements of the new vector is affected (although infinitesimally) by the first elements of the orginal vector.
I tried achieving this using the function EMA
from the package TTR
and lag
from dplyr
> library(dplyr)
> library(TTR)
> lag(EMA(x, 1, ratio = 2/(81+1)))
[1] NA 102.00000 102.04878 101.73052 101.42002 100.80002 100.65855 100.34981 100.63396 100.15508 100.17569
[12] 100.19579 100.28858 100.13520 100.40020 100.58556 100.66884 100.48179 100.27492 99.92675 100.24561 100.06889
[23] 100.06721 99.87045 99.38580 99.20566 99.85918 99.91139 100.13307 100.22738 100.09989 99.70721 99.25093
[34] 99.34237 98.94378 99.04271 98.65143 97.65993 97.93651 97.71855 97.60346 97.75948 97.91168 98.52360
[45] 98.65717 98.51919 97.96994 98.09262 98.21231 98.25592 98.42041 98.65405 98.44298 99.01754 99.11468
[56] 99.47773 99.53925 99.23342 99.20333 98.93008 99.12691 99.19698 98.72876 98.29635 98.24035 98.45400
[67] 98.61365 98.96454 98.94102 98.79611 98.92304 98.80296 99.00289 99.19794 99.14433 99.21398 98.79413
[78] 98.57964 98.54111 99.16206 99.40201 99.31903
But that's definetely not the result I was looking for.... what I'm doing wrong?
I wasn't able to find any comprehensive documentation about ratio
argument on internet, and I'm not sure I've got this clear.
Can anyone please help me?
To get things more clear: the result i reached until now is the following:
> library(runner)
> mean_run(x, k = 7, lag = 1)
[1] NA 102.00000 103.00000 98.33333 96.00000 92.00000 92.50000 91.85714 93.28571 90.00000 91.71429
[12] 93.42857 97.42857 97.28571 100.57143 100.00000 103.28571 102.14286 100.85714 98.28571 101.00000 98.42857
[23] 97.28571 95.57143 93.71429 93.71429 99.42857 97.85714 100.14286 100.71429 101.14286 101.71429 100.14286
[34] 96.85714 94.14286 93.28571 90.28571 85.00000 88.57143 89.71429 88.28571 91.28571 91.42857 97.14286
[45] 103.71429 101.42857 99.57143 101.00000 100.85714 100.28571 97.71429 98.28571 97.85714 104.42857 104.42857
[56] 106.00000 106.28571 103.71429 102.28571 102.00000 99.85714 99.71429 94.85714 91.85714 93.14286 94.42857
[67] 96.85714 97.71429 97.14286 99.00000 102.28571 102.00000 102.00000 102.28571 100.00000 100.57143 99.00000
[78] 97.00000 97.42857 99.85714 100.14286 100.00000
So this is the Simple Moving Average (SMA) over k=7
observations i obteneid with mean_run
function from runner
package.
Now i would like to "improve" this moving average by placing exponential increasing weights on each observations and making sure that also the last element is affected by the first one (the weight for that observation should be as close as possible to 0). That means that the window sizes for the rolling average will be:
n=0
for the 1st element (i.e. NA
)
n=1
for the 2nd element (i.e the 1st element of the original vector)
n=2
for the 3d element (i.e. the EMA of the 1st and 2nd elements)
n=3
for the 4th element (i.e. the EMA of the 1st,2nd and 3rd elements)
...
n=81
for the 82th element (i.e. the EMA of the first 81 elements)
I still wasn't able to find any good documentation about the ratio
arguments (i.e alpha), but i think it should be settled aribitrarily, but I'm not sure about that
Assuming that you intended what you wrote, i.e. a lagged exponential moving average and not a lagged weighted moving average defined in a comment, we define the iteration in iter and then use Reduce like this.
alfa <- 2/(81+1)
iter <- function(y, x) alfa * x + (1-alfa) * y
ema <- c(NA, head(Reduce(iter, tail(x, -1), init = x[1], acc = TRUE), -1))
# check
identical(ema[1], NA_real_)
## [1] TRUE
identical(ema[2], x[1])
## [1] TRUE
identical(ema[3], alfa * x[2] + (1-alfa) * x[1])
## [1] TRUE
identical(ema[4], alfa * x[3] + (1-alfa) * ema[3])
## [1] TRUE
The lagged weighted moving average in the comment is not an exponential moving average and is unlikely what you want but just to show how to implement it if the second argument to rollapply is a list containing a vector then that vector will be regarded as the offsets to use.
library(zoo)
c(NA, x[1], rollapplyr(x, list(-seq(2)), weighted.mean, c(alfa, 1-alfa)))