arraysperformancejuliaallocation

Unexpected memory allocation when using array views (julia)


I'm trying to search for the desired pattern (variable template) in the array X. The length of the template is 9.

I'm doing something like:

function check_alloc{T <: ZeroOne}(x :: AbstractArray{T}, temp :: AbstractArray{T})
    s = 0
    for i in 1 : 1000
        myView = view(x, i : i + 9)
        if myView == temp
            s += 1
        end
    end
    return s
end

and obtain unexpected memory allocations (46 Kbytes) in this short loop. Why do it happen and how can I prevent memory allocations and performance degradation?


Solution

  • The reason you're getting allocations is because view(A, i:i+9) creates a small object called a SubArray. This is just a "wrapper" that essentially stores a reference to A and the indices you passed in (i:i+9). Because the wrapper is small (~40 bytes for a one-dimensional object), there are two reasonable choices for storing it: on the stack or on the heap. "Allocations" refer only to heap memory, so if Julia can store the wrapper on the stack it would report no allocations (and would also be faster).

    Unfortunately, some SubArray objects currently (as of late 2017) have to be stored on the heap (EDIT: the situation is much improved in modern Julia versions, see https://julialang.org/blog/2020/08/julia-1.5-highlights/#struct_layout_and_allocation_optimizations). The reason is because Julia is a garbage-collected language, which means that if A is a heap-allocated object that is no longer in use, then A might be freed from memory. The key point is this: currently, references to A from other variables are counted only if those variables are stored on the heap. Consequently, if all SubArrays were stored on the stack, you would have a problem for code like this:

    function create()
        A = rand(1000)
        getfirst(view(A, 1:10))
    end
    
    function getfirst(v)
        gc()   # this triggers garbage collection
        first(v)
    end
    

    Because create doesn't use A again after that call to getfirst, it's not "protecting" A. The risk is that the gc call could end up freeing the memory associated with A (and thus breaking any usage of entries in v itself, since v relies on A), unless having v protects A from being garbage-collected. But currently, stack-allocated variables can't protect heap-allocated memory: the garbage collector only scans variables that are on the heap.

    You can watch this in action with your original function, modified to be slightly less restrictive by getting rid of the (irrelevant, for these purposes) T<:ZeroOne and allowing any T.

    function check_alloc(x::AbstractArray{T}, temp::AbstractArray{T}) where T
        s = 0
        for i in 1 : 1000
            myView = view(x, i : i + 9)
            if myView == temp
                s += 1
            end
        end
        return s
    end
    
    a = collect(1:1010);      # this uses heap-allocated memory
    b = collect(1:10);
    
    @time check_alloc(a, b);  # ignore the first due to JIT-compilation
    @time check_alloc(a, b)
    
    a = 1:1010                # this doesn't require heap-allocated memory
    @time check_alloc(a, b);  # ignore due to JIT-compilation
    @time check_alloc(a, b)
    

    From the first one (with a = collect(1:1010)), you get

    julia> @time check_alloc(a, b)
      0.000022 seconds (1.00 k allocations: 47.031 KiB)
    

    (notice this is ~47 bytes per iteration, consistent with the size of the SubArray wrapper) but from the second (with a = 1:1010) you get

    julia> @time check_alloc(a, b)
      0.000020 seconds (4 allocations: 160 bytes)
    

    There's an "obvious" fix to this problem: change the garbage collector so that stack-allocated variables can protect heap-allocated memory. That will happen some day (EDIT: it has already been done), but it's an extremely complex operation to support properly. So for now, the rule is that any object that contains a reference to heap-allocated memory must be stored on the heap.

    There's one final subtlety: Julia's compiler is quite smart, and in some cases elides the creation of the SubArray wrapper (basically, it rewrites your code in a way that uses the parent array object and the indices separately so that it never needs the wrapper itself). For that to work, Julia has to be able to inline any function calls into the function that created the view. Unfortunately, here == is slightly too big for the compiler to be willing to inline it. If you manually write out the operations that will be performed, then the compiler will elide the view and you'll also avoid allocations.