This one is driving me nuts. I have a large data.table with monthly stock data. Every June I assign every stock to one of 10 portfolios based on an accounting variable. I would like to carry forward the assigned portfolio variable to the next 11 month until each stock gets assigned to a new portfolio 1 to 10 in June next year. na.locf
is basically what I'm looking for but I am running into 2 issues:
na.locf
keeps carrying forward the portfolio number until there is a new one. That's why I looking for a code that carries forward the last observation a maxium of 11 times until June next year (when there is a new portfolio number).
That's the na.locf
solution right now with the 2 issues (PERMNO is the stock identifier):
COMPUSTAT_CRSP_IBES1[,
Portfolio_Monthly := na.locf(Portfolio_Monthly,
na.rm = FALSE),
by = PERMNO]
I tried to use rep
but that didn't work at all:
COMPUSTAT_CRSP_IBES1[,
Portfolio_Monthly := if_else(!is.na(Portfolio_Monthly),
rep(Portfolio_Monthly, 11),
NA),
by = PERMNO]
Thank's for any hints!
You can create and/or use your fiscal year (June - May) as one of the group by
criteria in your na.locf
solution
#show data before calculations
data.frame(dat)
#demo FY calculation
dat[, FY := year(MONTH) + as.numeric(month(MONTH) >= 6)]
#actual code
dat[, Portfolio_Monthly := zoo::na.locf(Portfolio_Monthly, na.rm=FALSE),
by=list(PERMNO, year(MONTH) + as.numeric(month(MONTH) >= 6))]
#show results
data.frame(dat)
library(data.table)
set.seed(0L)
dat <- data.table(PERMNO=rep(LETTERS[1:12], each=20),
MONTH=rep(seq(as.Date("2000-01-01"), by="1 month", length.out=20), 12),
Portfolio_Monthly=NA_real_)
for (i in sample(1:dat[,.N], 5)) {
set(dat, i, 3L, rnorm(1))
}
setorder(dat, PERMNO, MONTH)