I have a time series of returns, with all my dates in the first column. After hours of searching, I finally found a function that gets the first value of every month.
The problem was that the first value is not always on day 1; sometimes it falls on day 2, 3, 4, 5, etc.
Luckily, with this function it all works:
library(xts)
month <- data[, 1]  ## all my dates
first.values <- month[head(endpoints(month, "months") + 1, -1)]
What I want to understand, though: why the +1 and the -1? That is all I am asking.
It is not enough for me that the code works; I really want to understand it. Unfortunately, I do not have enough reputation to comment on the post where I found this code.
Let's prepare some sample data:
month <- seq.Date(from = Sys.Date() - 5, to = Sys.Date() + 10, by = "day")  # run on 2018-06-23
# [1] "2018-06-18" "2018-06-19" "2018-06-20" "2018-06-21" "2018-06-22" "2018-06-23" "2018-06-24" "2018-06-25" "2018-06-26"
# [10] "2018-06-27" "2018-06-28" "2018-06-29" "2018-06-30" "2018-07-01" "2018-07-02" "2018-07-03"
xts::endpoints gives the index of the last observation of each month, and the result always begins with 0:
library(xts)
endpoints(month, "months")
# [1] 0 13 16
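Because the sample data depends on Sys.Date(), the numbers change from day to day. As a minimal sketch, the same result can be reproduced with fixed dates; the name dates and the hard-coded range are assumptions chosen to match the output shown above.

```r
library(xts)

# Fixed dates (an assumption, chosen to match the sample output above)
dates <- seq(as.Date("2018-06-18"), as.Date("2018-07-03"), by = "day")

# Leading 0, then the position of the last observation of each month
ep <- endpoints(dates, "months")
ep
# [1]  0 13 16

# Dropping the leading 0 and indexing gives the last date of each month
dates[ep[-1]]
# [1] "2018-06-30" "2018-07-03"
```

June contributes 13 dates (the 18th to the 30th) and July 3 more, which is where 13 and 16 come from.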
So if you add 1, each "last observation of a month" index becomes the index of the first available observation of the next month. Conveniently, the leading 0 becomes 1, the index of the first observation of the first month:
endpoints(month, "months") + 1
# [1] 1 14 17
The last value points one past the end of the data (there are only 16 dates), so it is meaningless and we drop it with head(..., -1):
head(endpoints(month, "months") + 1, -1)
# [1] 1 14
And we end up with your solution:
first.values <- month[head(endpoints(month, "months") + 1, -1)]
# [1] "2018-06-18" "2018-07-01"
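The case from the question, where a month does not start on day 1, can be sketched the same way. The dates below are hypothetical: a fixed range with weekends removed, so that July and September 2018 start on day 2 and day 3. The names all_days, trading and first_idx are made up for illustration.

```r
library(xts)

# Hypothetical trading calendar: a fixed range with weekends removed
all_days <- seq(as.Date("2018-06-25"), as.Date("2018-09-05"), by = "day")
wday     <- as.POSIXlt(all_days)$wday          # 0 = Sunday, 6 = Saturday
trading  <- all_days[!(wday %in% c(0, 6))]

ep        <- endpoints(trading, "months")      # 0, then last position of each month
first_idx <- head(ep + 1, -1)                  # +1: next position; -1: drop the one past the end

trading[first_idx]
# [1] "2018-06-25" "2018-07-02" "2018-08-01" "2018-09-03"
```

The code never looks at the day of the month; it only uses positions, which is why it picks the first available date whatever day that happens to be.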
An alternative way of doing it:
month <- as.xts(month)
first_as_list <- lapply(split(month,f="month"), function(x) index(x)[1])
do.call(c,first_as_list)
# [1] "2018-06-18" "2018-07-01"
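If xts is not needed otherwise, a similar result can be obtained in base R. This is a different technique from the two above, shown only as a sketch: it assumes the dates are sorted in ascending order, and the names dates and ym are made up for illustration.

```r
# Fixed dates (an assumption, matching the sample output above)
dates <- seq(as.Date("2018-06-18"), as.Date("2018-07-03"), by = "day")

# Year-month label for every date, e.g. "2018-06"
ym <- format(dates, "%Y-%m")

# Keep the first date seen for each year-month label
first_dates <- dates[!duplicated(ym)]
first_dates
# [1] "2018-06-18" "2018-07-01"
```

Here duplicated() marks every repeat of a label, so negating it keeps only the first occurrence within each month.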