rvectordplyrmoving-averagettr

Lagged exponential moving average of a vector


Given a simple vector of 82 observation

x = c(102, 104, 89, 89, 76, 95, 88, 112, 81, 101, 101, 104, 94, 111, 108, 104, 93, 92, 86, 113, 93, 100, 92, 80, 92, 126, 102, 109, 104, 95, 84, 81, 103, 83, 103, 83, 58, 109, 89, 93, 104, 104, 123, 104, 93, 76, 103, 103, 100, 105, 108, 90, 122, 103, 114, 102, 87, 98, 88, 107, 102, 80, 81, 96, 107, 105, 113, 98, 93, 104, 94, 107, 107, 97, 102, 82, 90, 97, 124, 109, 96, 92)

I would like to perform EMA (Exponential Moving Average) of this vector in such a way:

The ideas are to to place greater weight on the most recent vector elements and that also the last elements of the new vector is affected (although infinitesimally) by the first elements of the orginal vector.

I tried achieving this using the function EMA from the package TTR and lag from dplyr

> library(dplyr)
> library(TTR)
> lag(EMA(x, 1, ratio = 2/(81+1)))

 [1]        NA 102.00000 102.04878 101.73052 101.42002 100.80002 100.65855 100.34981 100.63396 100.15508 100.17569
[12] 100.19579 100.28858 100.13520 100.40020 100.58556 100.66884 100.48179 100.27492  99.92675 100.24561 100.06889
[23] 100.06721  99.87045  99.38580  99.20566  99.85918  99.91139 100.13307 100.22738 100.09989  99.70721  99.25093
[34]  99.34237  98.94378  99.04271  98.65143  97.65993  97.93651  97.71855  97.60346  97.75948  97.91168  98.52360
[45]  98.65717  98.51919  97.96994  98.09262  98.21231  98.25592  98.42041  98.65405  98.44298  99.01754  99.11468
[56]  99.47773  99.53925  99.23342  99.20333  98.93008  99.12691  99.19698  98.72876  98.29635  98.24035  98.45400
[67]  98.61365  98.96454  98.94102  98.79611  98.92304  98.80296  99.00289  99.19794  99.14433  99.21398  98.79413
[78]  98.57964  98.54111  99.16206  99.40201  99.31903

But that's definetely not the result I was looking for.... what I'm doing wrong? I wasn't able to find any comprehensive documentation about ratio argument on internet, and I'm not sure I've got this clear. Can anyone please help me?

To get things more clear: the result i reached until now is the following:

> library(runner)
> mean_run(x, k = 7, lag = 1)

 [1]        NA 102.00000 103.00000  98.33333  96.00000  92.00000  92.50000  91.85714  93.28571  90.00000  91.71429
[12]  93.42857  97.42857  97.28571 100.57143 100.00000 103.28571 102.14286 100.85714  98.28571 101.00000  98.42857
[23]  97.28571  95.57143  93.71429  93.71429  99.42857  97.85714 100.14286 100.71429 101.14286 101.71429 100.14286
[34]  96.85714  94.14286  93.28571  90.28571  85.00000  88.57143  89.71429  88.28571  91.28571  91.42857  97.14286
[45] 103.71429 101.42857  99.57143 101.00000 100.85714 100.28571  97.71429  98.28571  97.85714 104.42857 104.42857
[56] 106.00000 106.28571 103.71429 102.28571 102.00000  99.85714  99.71429  94.85714  91.85714  93.14286  94.42857
[67]  96.85714  97.71429  97.14286  99.00000 102.28571 102.00000 102.00000 102.28571 100.00000 100.57143  99.00000
[78]  97.00000  97.42857  99.85714 100.14286 100.00000

So this is the Simple Moving Average (SMA) over k=7 observations i obteneid with mean_run function from runner package. Now i would like to "improve" this moving average by placing exponential increasing weights on each observations and making sure that also the last element is affected by the first one (the weight for that observation should be as close as possible to 0). That means that the window sizes for the rolling average will be:

I still wasn't able to find any good documentation about the ratio arguments (i.e alpha), but i think it should be settled aribitrarily, but I'm not sure about that


Solution

  • Assuming that you intended what you wrote, i.e. a lagged exponential moving average and not a lagged weighted moving average defined in a comment, we define the iteration in iter and then use Reduce like this.

    alfa <- 2/(81+1)
    iter <- function(y, x) alfa * x + (1-alfa) * y
    ema <- c(NA, head(Reduce(iter, tail(x, -1), init = x[1], acc = TRUE), -1))
    
    # check
    
    identical(ema[1], NA_real_)
    ## [1] TRUE
    identical(ema[2], x[1])
    ## [1] TRUE
    identical(ema[3], alfa * x[2] + (1-alfa) * x[1])
    ## [1] TRUE
    identical(ema[4], alfa * x[3] + (1-alfa) * ema[3])
    ## [1] TRUE
    

    The lagged weighted moving average in the comment is not an exponential moving average and is unlikely what you want but just to show how to implement it if the second argument to rollapply is a list containing a vector then that vector will be regarded as the offsets to use.

    library(zoo)
    c(NA, x[1], rollapplyr(x, list(-seq(2)), weighted.mean, c(alfa, 1-alfa)))