lua lpeg

lpeg.Cmt and empty loop body


LPeg treats a loop body consisting of lpeg.Cmt (…) as possibly empty under Lua 5.1, 5.2, 5.3, 5.4 and LuaJIT, even though I am sure that the function passed to lpeg.Cmt always returns either an advanced position or false for failure.

A minimal example that causes the 'loop body may accept empty string' error:

local lpeg = require 'lpeg'
local pattern = lpeg.Cmt (lpeg.P (true), function (s, p)
    local n = f (s, p) -- the result of f cannot be computed beforehand.
    if not ((n or 0) > 0) then
        return false -- cannot advance, therefore fail.
    else
        return p + n, string.sub (s, p, p + n - 1) -- advance by n characters.
    end
end) ^ 1

I don't know beforehand how much or what kind of input lpeg.Cmt (…) will consume, but it will either consume at least one character or fail when the function returns false.
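The error is raised while the pattern is being constructed (when ^ 1 is applied), not during matching, so the callback never gets a chance to run. A minimal sketch that shows this by wrapping the construction in pcall; the callback advance_one is a hypothetical stand-in for the real function, chosen only because it obviously consumes input:

```lua
local lpeg = require 'lpeg'

-- Hypothetical stand-in callback: consumes exactly one character, or fails.
-- It is never called below, because the error occurs at construction time.
local function advance_one (s, p)
    if p > #s then
        return false
    end
    return p + 1, string.sub (s, p, p)
end

-- Building the loop is the step that fails, so wrap it in pcall.
local ok, err = pcall (function ()
    return lpeg.Cmt (lpeg.P (true), advance_one) ^ 1
end)

print (ok, err)
```

Here ok is expected to be false and err to contain the message quoted above, which suggests that LPeg decides from the inner pattern lpeg.P (true) alone, without consulting the function.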

Is there a workaround?


Solution

  • The workaround is to make the pattern inside lpeg.Cmt(…) consume at least one character, so that LPeg can tell when the loop is built that its body is not empty. Since the function passed to lpeg.Cmt(…) receives the whole input string, the first character of the relevant part of the string is not lost: the function can step back one position. The only case in which this workaround does not work is at the very end of the input string, where lpeg.P (1) fails before the function is called; this matters only if the function passed to lpeg.Cmt(…) may accept an empty string.

    The workaround:

    local lpeg = require 'lpeg'
    local pattern = lpeg.Cmt (lpeg.P (1), function (s, p)
        local p = p - 1 -- compensate for the one character consumed by lpeg.P (1).
        local n = f (s, p) -- the result of f cannot be computed beforehand.
        if not ((n or 0) > 0) then
            return false -- cannot advance, therefore fail.
        else
            return p + n, string.sub (s, p, p + n - 1) -- advance by n characters.
        end
    end) ^ 1
    

    Note that lpeg.P (true) has been replaced with lpeg.P (1), and that local p = p - 1 has been inserted.
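
To try the workaround end to end, f needs a concrete definition. A minimal sketch, assuming a hypothetical f that accepts at most two digits at a time (this f is an illustration only, not the function from the question):

```lua
local lpeg = require 'lpeg'

-- Hypothetical f: length of a run of one or two digits starting at p, or nil.
local function f (s, p)
    local first, last = string.find (s, '^%d%d?', p)
    if first then
        return last - first + 1
    end
    return nil
end

local pattern = lpeg.Cmt (lpeg.P (1), function (s, p)
    local start = p - 1 -- step back over the character consumed by lpeg.P (1).
    local n = f (s, start)
    if not ((n or 0) > 0) then
        return false -- cannot advance, therefore fail.
    end
    return start + n, string.sub (s, start, start + n - 1) -- advance by n.
end) ^ 1

-- Each successful iteration of the loop yields one capture.
local a, b, c = lpeg.match (pattern, '12345x')
print (a, b, c)
print (lpeg.match (pattern, 'x'))
```

With the subject '12345x' the loop runs three times, producing the captures 12, 34 and 5, and stops at x, where f returns nil. The subject 'x' does not match at all, because ^ 1 requires at least one successful iteration.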