[SOLVED] Compute row sums and column sums efficiently in Julia

Consider the code below:

test_vector = 1:100
group_size = 10
num_groups = div(length(test_vector), group_size)
res_1 = sum(reshape(test_vector, group_size, num_groups), dims=1)
res_2 = sum(reshape(test_vector, group_size, num_groups), dims=2)

This code does what I want. Essentially, I reshape the vector into a 10x10 matrix and compute its column sums (dims=1) and row sums (dims=2). However, is there a way to accomplish this more efficiently? I know that reshape allocates memory every time. I intend to run this code many times in my program (i.e. I will write it as a function with test_vector and group_size as arguments and call it repeatedly). What is the most efficient way, in terms of speed and memory allocation, to accomplish this?

I tried to adapt the code here but it doesn't quite accomplish what I want. Can I get a hint? Thank you.

Solution

In Julia, reshape does not allocate a new copy of the data.

julia> v1 = rand(100);

julia> @allocated v2 = reshape(v1, 10, 10)
96

(The 96 bytes hold only the information about the new shape; the data is not copied.)

This means that mutating the original mutates the reshape:

julia> v1[1] = 99; @show v2[1,1];
v2[1, 1] = 99.0

Indeed both variables point to the same place in memory:

julia> Int(pointer(v1)) == Int(pointer(v2))
true

If you want a real copy, you need to collect the result:

julia> @allocated v3 = collect(reshape(v1, 10, 10))
992

The sums benchmark the same whether you use v2 or v3, so reshape is the most efficient option for you (unless you write a dedicated loop and apply SIMD to it).
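
For illustration, here is a rough sketch of both options as functions taking test_vector and group_size, as the question describes. The names group_sums and group_sums_loop are hypothetical. The sketch assumes a 1-based vector whose length is a multiple of group_size, and the loop version accumulates in the element type (so it does not widen small integer types the way sum does). Whether the loop is actually faster should be measured on your own data.

```julia
# Hypothetical helper: reshape once, then reduce along each dimension.
# Returns (column sums as a 1×n matrix, row sums as an n×1 matrix).
function group_sums(v::AbstractVector, group_size::Integer)
    num_groups = div(length(v), group_size)
    m = reshape(v, group_size, num_groups)   # shares data with v, no copy
    return sum(m, dims=1), sum(m, dims=2)
end

# Hypothetical helper: one pass over the data with an explicit loop.
# Returns (column sums, row sums) as plain vectors.
# Assumes 1-based indexing and length(v) >= group_size * num_groups.
function group_sums_loop(v::AbstractVector{T}, group_size::Integer) where {T}
    num_groups = div(length(v), group_size)
    col_sums = zeros(T, num_groups)
    row_sums = zeros(T, group_size)
    @inbounds for j in 1:num_groups
        offset = (j - 1) * group_size        # start of column j in the flat vector
        s = zero(T)
        @simd for i in 1:group_size
            x = v[offset + i]
            s += x                           # running sum of column j
            row_sums[i] += x                 # contribution to row i
        end
        col_sums[j] = s
    end
    return col_sums, row_sums
end

res_1, res_2 = group_sums(1:100, 10)
col_sums, row_sums = group_sums_loop(1:100, 10)
@show vec(res_1) == col_sums
@show vec(res_2) == row_sums
```

The loop version reads the input once and allocates only the two output vectors, while the reshape version makes two passes but is shorter and relies on Base's optimized reductions.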