julia

Why is breaking out of a for loop slower than iterating the whole vector?


I pass in a vector that is sorted in descending order and manually sum until a certain integer is reached. In the first function below, I let the for loop continue until the end, but in the second function I break out of the for loop. Weirdly enough, breaking out of the for loop is three times slower. I expected it to be quicker, as it doesn't have to finish the for loop.

Any ideas?

using BenchmarkTools

function manual_sum_whole_vector(in_vec, i)
    n = 0
    for x in in_vec
        if x >= i
            n += 1
        end
    end
    return n
end

function manual_sum_break(in_vec, i)
    n = 0
    for x in in_vec
        if x >= i
            n += 1
        else
            break
        end
    end
    return n
end

in_vec = collect(100000:-1:1)

@btime manual_sum_whole_vector(in_vec, 12345)
@btime manual_sum_break(in_vec, 12345)

6.575 μs (1 allocation: 16 bytes)
19.625 μs (1 allocation: 16 bytes)

Solution

  • My surmise is that the code without a break is getting vectorized by the compiler, while the code with a break cannot be vectorized and must remain as a scalar loop. Because the loop exit happens almost 90% of the way through the array, the inefficiency of iterating through the last 10% of the array is small compared to the gains from vectorization.