lua lua-patterns

How to write a Lua pattern that is aware of escaped characters?


I want to write a pattern that takes a string like /a/b/c and extracts a, b, and c. Each of a, b, and c is optional, so /// is valid input. Currently I have this: "^%/(.-)%/(.-)%/(.-)$". This works, except that for the input /</>/b/c I get the matches <, >, b/c. Obviously the second / should be escaped like this: /<\\/>/b/c, but that gives me <\, >, b/c. Is there a way to write this pattern so that /<\\/>/b/c would give me <\/>, b, c? I know I could change the first .- to a .+ and that would solve this exact case, but it doesn't solve the larger issue (i.e., what if the escaped slash is in section b?).
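
The behaviour described above can be reproduced with a short sketch. The variable name `pattern` and the choice of test strings are only for illustration.

```lua
-- The pattern from the question: three lazy captures separated by slashes.
local pattern = "^%/(.-)%/(.-)%/(.-)$"

print(("/a/b/c"):match(pattern))      -- plain input
print(("///"):match(pattern))         -- all three sections empty
print(("/</>/b/c"):match(pattern))    -- unescaped slash inside the first section
print(("/<\\/>/b/c"):match(pattern))  -- escaped slash: the pattern still splits on it
```

Lua patterns give the backslash no special meaning, so `.-` stops at the first slash whether or not a backslash precedes it.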


Solution

  • This cannot be done with a single Lua pattern, but you can chain several of them:

    local s = "/<\\/>//b\\\\/c"  -- 4 payloads here  (the second one is empty)
    for x in s
          :gsub("/", "/\1")       -- prepend string.char(1) to every payload, so no slash directly follows another slash
          :gsub("\\(.)", "\2%1")  -- replace each escaping backslash with string.char(2)
          :gsub("%f[/\2]/", "\0") -- replace non-escaped slashes with string.char(0)
          :gsub("[\1\2]", "")     -- remove temporary symbols string.char(1) and string.char(2)
          :gmatch"%z(%Z*)"        -- split by string.char(0)
    do
       print(x)
    end
    

    Output:

    </>
    
    b\
    c
    
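
    To see why the chain works, it can help to print the string after each step. The sketch below does that; `show` is a hypothetical helper (not part of the answer) that renders the temporary bytes 0, 1 and 2 as `{0}`, `{1}` and `{2}`. It assumes, as the chain itself does, that the input contains none of these bytes.

    ```lua
    -- Hypothetical helper: make the temporary control bytes visible.
    local function show(str)
      return (str:gsub("%z", "{0}"):gsub("\1", "{1}"):gsub("\2", "{2}"))
    end

    local s  = "/<\\/>//b\\\\/c"
    local s1 = s:gsub("/", "/\1")          -- mark the start of every payload
    local s2 = s1:gsub("\\(.)", "\2%1")    -- turn each escaping backslash into byte 2
    local s3 = s2:gsub("%f[/\2]/", "\0")   -- a slash not preceded by "/" or byte 2 is a separator
    local s4 = s3:gsub("[\1\2]", "")       -- drop the temporary markers

    print(show(s1))  -- /{1}<\/{1}>/{1}/{1}b\\/{1}c
    print(show(s2))  -- /{1}<{2}/{1}>/{1}/{1}b{2}\/{1}c
    print(show(s3))  -- {0}{1}<{2}/{1}>{0}{1}{0}{1}b{2}\{0}{1}c
    print(show(s4))  -- {0}</>{0}{0}b\{0}c
    ```

    The frontier pattern `%f[/\2]/` only matches a slash whose preceding character is neither a slash nor byte 2. This is why the first step inserts byte 1 after every slash: without it, the second slash of `//` would be skipped.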

    Or, if you want a single statement instead of a loop:

    local s = "/<\\/>/b/c"  -- 3 payloads here
    local a, b, c = s
          :gsub("/", "/\1")       -- prepend string.char(1) to every payload, so no slash directly follows another slash
          :gsub("\\(.)", "\2%1")  -- replace each escaping backslash with string.char(2)
          :gsub("%f[/\2]/", "\0") -- replace non-escaped slashes with string.char(0)
          :gsub("[\1\2]", "")     -- remove temporary symbols string.char(1) and string.char(2)
          :match"%z(%Z*)%z(%Z*)%z(%Z*)"  -- split by string.char(0)
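
    If the number of payloads is not known in advance, the same chain could be wrapped in a function that collects the pieces into a table. This is only a sketch: the name `split_escaped` is made up for illustration, and it assumes the input starts with a slash and never contains the bytes 0, 1 or 2.

    ```lua
    -- Hypothetical wrapper around the gsub chain; returns a list of payloads.
    local function split_escaped(s)
      local marked = s
        :gsub("/", "/\1")        -- mark the start of every payload
        :gsub("\\(.)", "\2%1")   -- turn each escaping backslash into byte 2
        :gsub("%f[/\2]/", "\0")  -- unescaped slashes become byte 0
        :gsub("[\1\2]", "")      -- drop the temporary markers
      local fields = {}
      for field in marked:gmatch("%z(%Z*)") do  -- split on byte 0
        fields[#fields + 1] = field
      end
      return fields
    end

    local parts = split_escaped("/<\\/>/b/c")
    print(#parts, parts[1], parts[2], parts[3])
    ```

    Assigning the chain to a single local keeps only the first return value of the last `gsub`, so the substitution count is discarded.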