rfor-loopapplylapplyxirr

Converting for() loops to apply() functions in R for a custom XIRR function


I've currently been looking for functions to calculate the XIRR of a large database of cashflows/balances in R, and I've come across this function, which I've been trying to modify to fit my code:

library(tidyverse)

xirr2 <- function(exflow, date) {
    if(as.numeric(max(date) - min(date)) <= 2) {
        return(0)
    } else if(abs(sum(exflow, na.rm = TRUE)) < 1e-12) {
        return(0)
    } else {
        npv <- function(range, exflow, date){
            for(test.rate in range) {
                temp <- as.data.frame(cbind(exflow, date)) %>%
                    mutate(npv = exflow * ((1 + test.rate/100)^(as.numeric(max(date) - date)/365))) %>%
                    select(npv) %>%
                    .[1]
                if(sum(exflow, na.rm = TRUE) > 0) {
                    if(sum(temp, na.rm = TRUE) > 0) {
                        min.rate <- test.rate
                        next
                    } else {
                        max.rate <- test.rate
                        break
                    }
                } else {
                    if(sum(temp, na.rm = TRUE) < 0) {
                        min.rate <- test.rate
                        next
                    } else {
                        max.rate <- test.rate
                        break
                    }
                }
            }
            return(list(min.rate = min.rate, max.rate = max.rate))
        }
        max.rate <- c()
        min.rate <- c()
        if(sum(exflow, na.rm = TRUE) >= 1e-12) {
            range <- seq(from = 0, to = 1e8, by = 1e3)    
            hundreds <- npv(range, exflow, date)
            range <- seq(from = hundreds$min.rate, to = hundreds$max.rate, by = 10)
            tens <- npv(range, exflow, date)
            range <- seq(from = tens$min.rate, to = tens$max.rate, by = 1)
            ones <- npv(range, exflow, date)
            range <- seq(from = ones$min.rate, to = ones$max.rate, by = 0.01)
            decimals <- npv(range, exflow, date)
            return(mean(unlist(decimals))/100)
        } else {
            range <- seq(from = 0, to = -1e8, by = -1e3)
            hundreds <- npv(range, exflow, date)
            range <- seq(from = hundreds$min.rate, to = hundreds$max.rate, by = -10)
            tens <- npv(range, exflow, date)
            range <- seq(from = tens$min.rate, to = tens$max.rate, by = -1)
            ones <- npv(range, exflow, date)
            range <- seq(from = ones$min.rate, to = ones$max.rate, by = -0.01)
            decimals <- npv(range, exflow, date)
            return(mean(unlist(decimals))/100) 
        }
    }
}

Basically, given a vector of cashflows and a vector of corresponding dates, this function returns the annualized XIRR of a investment.

While it works great and generates consistently correct answers when cross-referenced with MS Excel and LibreOffice Calc, it is a bit on the slower side, and I feel that it could be improved by replacing the for() loop with an apply() function or something from the data.table package. The speed issues are barely noticeable on small examples, but on large datasets like mine with a ton of edge cases, the slow-down can be pretty substantial.

For what it's worth, I've tried a multitude of other XIRR functions from various packages, including tvm, FinancialMath, and FinCal. For whatever reason, these functions tend to break down over time: solutions eventually stop converging and become inaccurate, particularly with large cashflows and rapid changes between positive/negative returns. This might possibly be due to a common reliance on the uniroot() or polyroot() functions in R to calculate XIRR, but I'm not sure.

In any case, the above function actually gets me the numbers I want---I just need some help optimizing it for larger datasets. Thank you in advance!

EDIT

Thank you for the help so far. Here are some minimum examples:

Some deposits, some withdrawals, and then a complete withdrawal for a positive return. MS Excel shows XIRR = 15.32%:

> flow1 <- c(-1000,-100,100,1200)
> date1 <- as.Date(c("2018-01-01","2018-10-31","2019-03-31","2019-03-31"), format = "%Y-%m-%d")
> tvm::xirr(flow1,date1)
Error in uniroot(xnpv, interval = interval, cf = cf, d = d, tau = tau,  : 
  f.lower = f(lower) is NA
> xirr2(flow1,date1)
[1] 0.15315

An account receiving regular contributions with a poor return. MS Excel shows XIRR = -27.54%:

> flow2 <- c(-200,-200,-200,-200,-200,800)
> date2 <- as.Date(c("2018-01-01","2018-03-01","2018-06-01","2018-09-01","2019-01-01","2019-03-01"), format = "%Y-%m-%d")
> tvm::xirr(flow2,date2)
Error in uniroot(xnpv, interval = interval, cf = cf, d = d, tau = tau,  : 
  f.lower = f(lower) is NA
> xirr2(flow2,date2)
[1] -0.27535

Maybe I'm just using tvm::xirr() wrong? I'm not sure how to correct that uniroot() error.


Solution

  • OK, I figured it out thanks to this answer. Turns out that tvm::xirr() does work well (and is significantly faster than the above function), but I had been calling it incorrectly. Here is a working example:

    > flow2 <- c(-200,-200,-200,-200,-200,800)
    > date2 <- as.Date(c("2018-01-01","2018-03-01","2018-06-01","2018-09-01","2019-01-01","2019-03-01"), format = "%Y-%m-%d")
    > tvm::xirr(flow2, date2, comp_freq = 1, maxiter = 100, tol = 1e-8, lower = -0.999, upper = 100)
    [1] -0.2753857
    

    I had tried this earlier with lower = -1, which gives the same error as above. So I was off by 0.001... so close. Thank you all again for your help!