lua lpeg

How to use LPeg to replace parts of the matched grammar?


Given an LPeg grammar, I want to do a find/replace on some of the sub-matches.

For example, suppose that I want to replace all the occurrences of the letter "a" that are outside of double quotes. I came up with a grammar for that:

local re = require "re"
local pattern = re.compile([[
    line <- (quoted / 'a' / .)*

    quoted <- '"' rest
    rest   <- '"' / . rest
]])

It seems that re.gsub is not useful here, because it would mess with the parts inside quotes. The best I could come up with was to use a table capture, where we capture everything besides the "a". But it was unwieldy because the pattern returns a table instead of a string.

line <- {| ({quoted} / 'a' / {.})* |}
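
For reference, here is a minimal sketch of that table-capture attempt, showing why it felt unwieldy: the match returns a table of fragments that has to be joined back into a string by hand. The input string is just an illustration, and deleting the "a" stands in for whatever replacement is wanted.

```lua
local re = require "re"

-- Capture everything except the unquoted 'a' into a table.
local pattern = re.compile([[
    line <- {| ({quoted} / 'a' / {.})* |}

    quoted <- '"' rest
    rest   <- '"' / . rest
]])

-- The match returns a table of fragments, not a string...
local pieces = pattern:match('banana "banana"')

-- ...so the fragments have to be joined manually.
local result = table.concat(pieces)
print(result)  --> bnn "banana"
```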

I also looked into substitution captures, but got stuck because I needed to add ->'%1' around everything other than the 'a'.


Solution

  • Turns out that the "%1" bit has to do with string captures, not with substitution captures. Inside a substitution capture, text matched without a capture is kept as-is, so I can solve the problem by adding a capture only around the parts that I want to replace and leaving the other parts alone.

    local re = require "re"
    local pattern = re.compile([[
        line <- {~ (quoted / 'a'->'' / .)* ~}
    
        quoted <- '"' rest
        rest   <- '"' / . rest
    ]])
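
A short usage sketch of the grammar above, to show what the match returns. The names remove_a and replace_a, the input string, and the 'o' replacement in the second pattern are only illustrations, not part of the original question.

```lua
local re = require "re"

-- Same grammar as above: only the unquoted 'a' carries a capture,
-- so only that part of the match is replaced (here, with nothing).
local remove_a = re.compile([[
    line <- {~ (quoted / 'a'->'' / .)* ~}

    quoted <- '"' rest
    rest   <- '"' / . rest
]])

-- Variant that substitutes a different string instead of deleting.
local replace_a = re.compile([[
    line <- {~ (quoted / 'a'->'o' / .)* ~}

    quoted <- '"' rest
    rest   <- '"' / . rest
]])

print(remove_a:match('banana "banana"'))   --> bnn "banana"
print(replace_a:match('banana "banana"'))  --> bonono "banana"
```

The substitution capture returns a plain string, so no table.concat step is needed.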
    

    Pay attention to operator precedence: '->' binds more tightly than sequencing, so a replacement that should cover a whole sequence needs parentheses:

    ('a' 'b')->''
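
To make the precedence point concrete, here is a small sketch comparing the two forms. The pattern names tight and grouped and the test string are made up for illustration.

```lua
local re = require "re"

-- Without parentheses, '->' applies only to 'b': the 'a' is kept.
local tight = re.compile([[ {~ ('a' 'b'->'' / .)* ~} ]])

-- With parentheses, the whole sequence 'a' 'b' is replaced.
local grouped = re.compile([[ {~ (('a' 'b')->'' / .)* ~} ]])

print(tight:match("xabx"))    --> xax
print(grouped:match("xabx"))  --> xx
```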