
Compute row sums and column sums efficiently in Julia


Consider the code below:

test_vector = 1:100
group_size = 10
num_groups = div(length(test_vector), group_size)
res_1 = sum(reshape(test_vector, group_size, num_groups), dims=1)
res_2 = sum(reshape(test_vector, group_size, num_groups), dims=2)

This code does what I want. Essentially, I am reshaping the vector into a 10x10 matrix and computing the row sums and column sums. However, is there a way to accomplish this more efficiently? I know that reshape allocates memory every time. I intend to run this code many times in my program (i.e. I will write it as a function with test_vector and group_size as arguments, and call it repeatedly). What is the most efficient way, in terms of speed and memory allocation, to accomplish this?

I tried to adapt the code here but it doesn't quite accomplish what I want. Can I get a hint? Thank you.


Solution

  • In Julia, reshape does not copy the underlying data.

    julia> v1 = rand(100);
    
    julia> @allocated v2 = reshape(v1, 10, 10)
    96
    

    (only the information about the new shape is stored; the data is not copied)

    This means that mutating the original mutates the reshape:

    julia> v1[1] = 99; @show v2[1,1];
    v2[1, 1] = 99.0
    

    Indeed both variables point to the same place in memory:

    julia> Int(pointer(v1)) == Int(pointer(v2))
    true
    

    If you wanted a real copy, you would need to collect the result:

    julia> @allocated v3 = collect(reshape(v1, 10, 10))
    992
    

    The sums benchmark the same whether you use v2 or v3, so reshape is the most efficient option for you (unless you write a dedicated loop and apply SIMD to it).
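
As a rough sketch of how this could be wrapped in a function, as the question asks, something like the following might work. The names group_sums and group_sums_loop are hypothetical, not from the original answer, and both versions assume the vector length is an exact multiple of group_size. The second version is one possible shape for the "dedicated loop" mentioned above; it has not been benchmarked here, so whether it beats the reshape version would need measuring.

```julia
# Hypothetical helpers for illustration; names are not from the original post.

# Version 1: sums over a non-copying reshape of v.
function group_sums(v::AbstractVector, group_size::Integer)
    num_groups, r = divrem(length(v), group_size)
    r == 0 || throw(ArgumentError("length(v) must be a multiple of group_size"))
    m = reshape(v, group_size, num_groups)  # shares data with v, no copy
    per_group = vec(sum(m, dims=1))         # one sum per column (per group)
    per_position = vec(sum(m, dims=2))      # one sum per row (per position in a group)
    return per_group, per_position
end

# Version 2: a single pass over v without reshape.
# Assumes v uses 1-based indexing. Accumulates in eltype(v), so narrow
# integer types may overflow where `sum` would widen.
function group_sums_loop(v::AbstractVector{T}, group_size::Integer) where {T}
    num_groups, r = divrem(length(v), group_size)
    r == 0 || throw(ArgumentError("length(v) must be a multiple of group_size"))
    per_group = zeros(T, num_groups)
    per_position = zeros(T, group_size)
    for g in 1:num_groups
        offset = (g - 1) * group_size
        for i in 1:group_size
            x = v[offset + i]
            per_group[g] += x
            per_position[i] += x
        end
    end
    return per_group, per_position
end

per_group, per_position = group_sums(1:100, 10)
```

For the vector in the question, per_group holds the sums of 1:10, 11:20, and so on, and per_position holds the sums of elements 1, 11, 21, ... then 2, 12, 22, ... and so on.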