pythonnumpynumba

Why is this trivial numba function with a List input so slow


import numba
from typing import List

@numba.njit
def test(a: List[int]) -> int:
    return 1

test([i for i in range(2_000_000)])

takes 2s and scales linearly with the size of the list. Wrapping the input argument with numba.typed.List takes even longer. (all the time is spent on the numba.typed.List call.

The timings don't get better if the function is called multiple times (while only being defined once), i.e., this is not a matter of compilation time.

Is there a way to tell numba to just use the list as is?

In my actual application, the raw data comes from an external library that cannot return numpy arrays or numba lists directly, only Python lists.

I'm using numba 0.59.1 and Python 3.12 on a 4 core Ubuntu22 laptop with 16GB RAM.


Solution

  • Numba only operates on typed variables. It needs not only to check the types of all the items but also convert the whole list into a typed list. This implicit conversion can be particularly expensive since CPython lists, a.k.a. reflected lists contain pointers on allocated objects and each objects is reference counted. Typed lists of Numba are homogeneous and they do not contain references but directly the value in it. This is far more efficient and similar to a Numpy array with additional features like resizing.

    Is there a way to tell numba to just use the list as is?

    AFAIK, no. Reflected lists are not supported any-more in Numba code. Operating on reflected lists is not only inefficient, but also break the type system.

    The best option is to create directly a typed list. Here is an example:

    import numba as nb
    
    # Quite fast part (less than 0.1 seconds)
    reflected_lst = [i for i in range(2_000_000)]
    
    # Slow part (3 seconds)
    typed_lst = nb.typed.typedlist.List(reflected_lst)
    
    # Very fast part (less than 2 µs) since `lst` is already a typed-list
    test(typed_lst)
    

    As mentioned by @roganjosh, note the list generation is included in your benchmark, but this only takes a tiny fraction of the execution time (<5% on my machine).

    Note the conversion process is particularly expensive in Numba (as opposed to Numpy, see the below comments). There is an opened issue on this topic. To quote the developers:

    [...] we came to the conclusion that there is probably room for improvement.
    [...] this problem is all down to implementation details.

    As of now, the issue is still opened and there is a patch to improve a bit the performance of the conversion, but it is still rather slow with it.