
While loops are testing faster than for loops - should I use while loops instead of for loops?


I am using Nginx + Lua (OpenResty, LuaJIT), and I did some performance tests on various loops.

local ngx_log = ngx.log
-- https://openresty-reference.readthedocs.io/en/latest/Lua_Nginx_API/#nginx-log-level-constants
local ngx_LOG_TYPE = ngx.STDERR


local N=1e8

t0=os.clock()
a = 0
while a < N do
    a = a + 1
end
t1=os.clock()-t0
ngx_log(ngx_LOG_TYPE,"While Global " .. t1 .. " " .. math.floor(t1/t1*100+0.5))

t0=os.clock()
local a = 0
while a < N do
    a = a + 1
end
t2=os.clock()-t0
ngx_log(ngx_LOG_TYPE,"While Local " .. t2 .. " " .. math.floor(t1/t2*100+0.5))

t0=os.clock()
b = 0
for i=1,N do
    b = b + 1
end
t3=os.clock()-t0
ngx_log(ngx_LOG_TYPE,"For Global " .. t3 .. " " .. math.floor(t1/t3*100+0.5))

t0=os.clock()
local b = 0
for i=1,N do
    b = b + 1
end
t4=os.clock()-t0
ngx_log(ngx_LOG_TYPE,"For Local " .. t4 .. " " .. math.floor(t1/t4*100+0.5))

Here is the output:

[lua] test.lua:14: While Global 0.048999999999999 100, client: 127.0.0.1, server: localhost, request: "GET /test.lua HTTP/1.1", host: "localhost"
[lua] test.lua:22: While Local 0.030000000000001 163, client: 127.0.0.1, server: localhost, request: "GET /test.lua HTTP/1.1", host: "localhost"
[lua] test.lua:30: For Global 0.057000000000002 86, client: 127.0.0.1, server: localhost, request: "GET /test.lua HTTP/1.1", host: "localhost"
[lua] test.lua:38: For Local 0.036000000000001 136, client: 127.0.0.1, server: localhost, request: "GET /test.lua HTTP/1.1", host: "localhost"

They execute very fast, but the while loops are a tiny fraction faster than the for loops.

Should I change all my code to use while loops instead of for loops?


Solution

  • Performance profiling and optimization are specific to a given program. There is almost no point in benchmarking arbitrary fragments of code like these, as they do not necessarily reflect real-world use cases. Performance testing should be reserved for when a real bottleneck is observed, and actual alternatives that produce the same result can be developed and re-tested.

    This becomes apparent when you consider that the desired result in the examples shown is an integer, which is reached here by incrementing a variable some number of times. The comparison is rather unfair, because the for loops have to perform this increment twice per iteration.

    In other words:

    The for loops have the overhead of making available the local control value of i, containing the current iteration. This is in addition to the variable being incremented in the body of the loop.

    The while loops use the variable being incremented in the body of the loop as the control, reducing the number of instructions required.

    Faster than any of these loops is to just write a = 1e8. So surely constants are faster than loops, and we should always "use constants instead of loops"? Such a generalization is not particularly useful, as the use cases obviously differ. Generalizing a performance preference between for and while is equally problematic, as their use cases can also differ.

    When you do find a certain construct to be more performant than another for the same use case (i.e., they produce the same result), the only other question to ask yourself is whether any degradation in code quality is worth the performance bump. If so, then optimize away.

    For example, if you told me the difference in performance between the two constructs below was somewhere in the range of micro- to nanoseconds, then unless performance were unbelievably critical, I would choose the one that is vastly easier to understand and maintain, regardless of its relative performance.

    for key, value in pairs(t) do
        print(key, value)
    end

    ----

    do
        local key = next(t)

        while key do
            local value = t[key]
            print(key, value)
            key = next(t, key)
        end
    end
    

    TL;DR: Do your best to use the correct construct for a given use case, and save performance profiling for real-world problems.
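
As a rough sketch of what a fairer comparison might look like, the two functions below produce the same integer, and the for loop reuses its control variable instead of incrementing a second counter in the body. The names time_it, count_while and count_for are made up for illustration and do not come from the question, and N is smaller than in the question only to keep the run short. os.clock measures CPU time, and under LuaJIT the numbers depend heavily on what the JIT compiler does with each loop, so treat any timings as indicative at best.

```lua
-- Hypothetical helper: run fn(n) once and return the elapsed CPU time
-- (as reported by os.clock) together with the function's result.
local function time_it(fn, n)
    local start = os.clock()
    local result = fn(n)
    return os.clock() - start, result
end

-- while loop: the counter is both the loop control and the result.
local function count_while(n)
    local a = 0
    while a < n do
        a = a + 1
    end
    return a
end

-- Numeric for loop: the control variable i already holds the count,
-- so the body only copies it instead of incrementing a second variable.
local function count_for(n)
    local last = 0
    for i = 1, n do
        last = i
    end
    return last
end

local N = 1e6  -- assumption: smaller than the question's 1e8, for a quick run

local t_while, r_while = time_it(count_while, N)
local t_for, r_for = time_it(count_for, N)

-- Only compare timings when both constructs produce the same result.
assert(r_while == r_for)
print(string.format("while: %.6f s, for: %.6f s", t_while, t_for))
```

Even with the work evened out like this, a difference of a few milliseconds over a million iterations says little about a real request handler, which is the point made above.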