matlab performance for-loop multidimensional-array irr

3-dimensional IRR in Matlab 2019


I am trying to calculate an IRR over several dimensions in Matlab 2019a. My formula works in theory (ignoring the "multiple rates of return" warning for now), but for bigger arrays, i.e. noScenarios > 5 or so, the code gets very slow. What are the programming alternatives? I also tried fsolve, but I don't think it is any faster.

Please note that I'm no math whiz, so a bare keyword like "Brent's method" is not enough for me (e.g. as in What is the Most Efficient Way to Calculate the Internal Rate of Return IRR?). I would need to know a) how to implement it in Matlab, and b) whether it is pretty much idiot-proof, so that nothing can go wrong. Thank you!

clc
clear
close all

noScenarios = 50;

CF = ones(300,noScenarios,noScenarios,noScenarios);
CF = [repmat(-300, 1,noScenarios,noScenarios,noScenarios); CF];

for scenarios1 = 1:noScenarios
    for scenarios2 = 1:noScenarios
        for scenarios3 = 1:noScenarios
            IRR3dimensional(scenarios1,scenarios2,scenarios3) = irr(CF(:,scenarios1,scenarios2,scenarios3));
        end
    end
end

Solution

  • To calculate an IRR, you need to solve a polynomial equation: the net present value of the cash flows, viewed as a polynomial in 1/(1+r), must equal zero. This has to be done for each cash flow vector separately. Hence, passing a multidimensional array to irr does not improve execution time; I suspect that Matlab still loops over the columns internally.
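
    To make the polynomial connection concrete, here is a minimal sketch in base Matlab (no toolbox needed). The cash flow vector cf is made-up example data, and this only illustrates the idea; it is not necessarily what irr does internally. Substituting x = 1/(1+r) turns the NPV equation into an ordinary polynomial in x, which roots can solve:

    ```matlab
    % Made-up example: invest 100 now, receive 60 at the end of each of two periods.
    cf = [-100; 60; 60];

    % With x = 1/(1+r): NPV = cf(1) + cf(2)*x + cf(3)*x^2.
    % roots() expects coefficients from the highest power down, so flip the vector.
    x = roots(flipud(cf));

    % Keep only real, positive roots; these correspond to rates r > -1.
    x = x(abs(imag(x)) < 1e-12 & real(x) > 0);
    r = 1 ./ real(x) - 1;

    fprintf('IRR = %.6f\n', r);
    ```

    A cash flow vector with several sign changes can yield more than one valid rate, which is where the "multiple rates of return" warning comes from.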

    You might gain some speed by tuning the optimization options of fsolve, but a large improvement is very unlikely. Presumably, the Matlab developers already chose a sufficiently good approach.
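
    Since the question mentions Brent's method: Matlab's built-in fzero is a bracketing root finder of that family (it combines bisection, secant, and inverse quadratic interpolation), so it can be tried without any toolbox. A minimal sketch, assuming made-up cash flows cf and assuming that the bracket [-0.9, 10] contains exactly one rate at which the NPV changes sign. It is not benchmarked against irr here, so treat it as something to time rather than a guaranteed speed-up:

    ```matlab
    % Made-up example: invest 100 now, receive 60 at the end of each of two periods.
    cf = [-100; 60; 60];
    t = (0:numel(cf)-1)';                 % period index 0, 1, 2, ...

    % Net present value as a function of the rate r.
    npv = @(r) sum(cf ./ (1 + r).^t);

    % Search for the rate between -90% and +1000% (assumed bracket).
    r = fzero(npv, [-0.9, 10]);

    fprintf('IRR = %.6f\n', r);
    ```

    If the NPV has the same sign at both ends of the bracket, fzero throws an error instead of returning a wrong answer, which makes this approach reasonably hard to misuse.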

    Thus, your only other alternative is parallelization. If you have access to a server, or your laptop/desktop has multiple CPU cores, you can reduce your run time by running the irr calls in parallel. (You also need the Parallel Computing Toolbox for this; without it, parfor simply runs as an ordinary serial loop.)

    I modified your example slightly to use random cash flow values, which makes the results easier to check. I also reduced the number of scenarios and time points so that the timeit function could run multiple simulations in a reasonable time. (Please keep in mind that the execution time seems to be exponential in the number of time points.)

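    t = 150;            % number of time points after the initial outlay
    noScenarios = 10;   % size of each of the three scenario dimensions
    noThreads = 4;      % number of parallel workers
    
    % Random cash flows: one negative initial outlay followed by t positive inflows.
    CF = rand(t,noScenarios,noScenarios,noScenarios);
    CF = [-rand(1,noScenarios,noScenarios,noScenarios); CF];
    
    % Time each variant with timeit (result is in seconds per call).
    h1 = @() f1(CF, noScenarios);
    fprintf("%0.4f : single thread, loop\n", timeit(h1))
    
    h2 = @() f2(CF, noScenarios);
    fprintf("%0.4f : single thread, vectorized\n", timeit(h2))
    
    % Start the pool outside the timed function so its start-up cost is not measured.
    poolObj = parpool('local', noThreads);
    h3 = @() f3(CF, noScenarios);
    fprintf("%0.4f : parallelized outer loop\n", timeit(h3))
    delete(poolObj);
    
    poolObj = parpool('local', noThreads);
    h4 = @() f4(CF, noScenarios);
    fprintf("%0.4f : parallelized inner loop\n", timeit(h4))
    delete(poolObj);
    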
    
    % Baseline: three nested serial loops, one irr call per cash flow vector.
    function res = f1(CF, noScenarios)
        res = zeros(noScenarios, noScenarios, noScenarios);
        for scenarios1 = 1:noScenarios
            for scenarios2 = 1:noScenarios
                for scenarios3 = 1:noScenarios
                    res(scenarios1,scenarios2,scenarios3) = irr(CF(:,scenarios1,scenarios2,scenarios3));
                end
            end
        end
    end
    
    % One call to irr on the whole array, reshaped back to the three scenario dimensions.
    function res = f2(CF, noScenarios)
        res = reshape(irr(CF), noScenarios, noScenarios, noScenarios);
    end
    
    % Parallel outer loop: each parfor iteration handles noScenarios^2 irr calls.
    function res = f3(CF, noScenarios)
        res = zeros(noScenarios, noScenarios, noScenarios);
        parfor scenarios1 = 1:noScenarios
            for scenarios2 = 1:noScenarios
                for scenarios3 = 1:noScenarios
                    res(scenarios1,scenarios2,scenarios3) = irr(CF(:,scenarios1,scenarios2,scenarios3));
                end
            end
        end
    end
    
    % Parallel inner loop: a new parfor is started noScenarios^2 times, one irr call per iteration.
    function res = f4(CF, noScenarios)
        res = zeros(noScenarios, noScenarios, noScenarios);
        for scenarios1 = 1:noScenarios
            for scenarios2 = 1:noScenarios
                parfor scenarios3 = 1:noScenarios
                    res(scenarios1,scenarios2,scenarios3) = irr(CF(:,scenarios1,scenarios2,scenarios3));
                end
            end
        end
    end
    

    When I ran this code on a server with 4 CPUs and 16 GB of memory, I got the following results (times in seconds, as returned by timeit).

    19.9357 : single thread, loop
    20.4318 : single thread, vectorized
    ...
    5.6346 : parallelized outer loop
    ...
    12.4640 : parallelized inner loop
    

    As you can see, the vectorized call to irr provides no benefit over the loop. Here it is slightly slower; in my other tests, it was occasionally a bit faster.

    However, you can significantly reduce your run time by parallelizing your outer loop with parfor. This is better than parallelizing the innermost loop because every batch of work sent to the workers carries a certain overhead, so a small number of large batches costs less than a large number of small batches.
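
    One way to get large batches without depending on the size of the outer dimension is to flatten the three scenario dimensions into one and run a single parfor over all columns. A sketch under the same assumptions as the code above (irr from the Financial Toolbox, a pool for parfor); the names CF2 and n and the small sizes are chosen here for illustration, and this variant was not part of the timings above:

    ```matlab
    t = 20;
    noScenarios = 4;
    CF = [-rand(1,noScenarios,noScenarios,noScenarios); ...
           rand(t,noScenarios,noScenarios,noScenarios)];

    % Collapse the three scenario dimensions: one column per scenario combination.
    CF2 = reshape(CF, size(CF,1), []);
    n = size(CF2, 2);

    res = zeros(n, 1);
    parfor k = 1:n
        res(k) = irr(CF2(:, k));   % CF2 and res are sliced, so each worker gets only its columns
    end

    % Column-major order is preserved, so res(i,j,l) belongs to CF(:,i,j,l).
    res = reshape(res, noScenarios, noScenarios, noScenarios);
    ```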

    Here is how the parallelization works. First, you create a pool of local workers (each one is a separate Matlab process) with the command below. Make sure you do not request more workers than you have CPU cores. The local profile limits the number of workers (by default to the number of physical cores), and parpool can wait indefinitely for workers that cannot be started.

    poolObj = parpool('local', noThreads);
    

    Creating the pool may take a few seconds. That is why I moved it outside of the function I timed. For larger jobs, the pool creation time is insignificant compared to the total execution time.

    Here, I save the pool object in a variable and delete it afterwards. This is optional, however: by default, the pool is destroyed after 30 minutes of inactivity or when Matlab terminates.
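
    If the script is run repeatedly, restarting the pool each time wastes those few seconds. A minimal sketch that reuses a running pool instead, assuming the Parallel Computing Toolbox is installed; noThreads is the variable from the example above:

    ```matlab
    noThreads = 4;

    % gcp('nocreate') returns the current pool, or an empty array if none is running.
    poolObj = gcp('nocreate');
    if isempty(poolObj)
        poolObj = parpool('local', noThreads);
    end

    fprintf('Pool with %d workers, idle timeout %d min\n', ...
            poolObj.NumWorkers, poolObj.IdleTimeout);
    ```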

    After that, you replace the for loop you want to parallelize with a parfor loop, i.e. for scenarios1 = 1:noScenarios becomes parfor scenarios1 = 1:noScenarios. By default, parfor uses all available workers, but you can also cap the number of workers with parfor (scenarios1 = 1:noScenarios, maxWorkers). Note, however, that the execution order is not guaranteed, i.e. the fifth iteration may be executed before the third.
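
    A small sketch of the maxWorkers form, with made-up sizes. Because each iteration writes only to its own element out(k), the result does not depend on the order in which the iterations run:

    ```matlab
    n = 8;
    maxWorkers = 2;          % use at most two workers for this loop

    out = zeros(1, n);
    parfor (k = 1:n, maxWorkers)
        out(k) = k^2;        % independent of every other iteration
    end

    disp(out)
    ```

    A loop in which one iteration reads a value written by another, for example out(k) = out(k-1) + 1, cannot be converted this way.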