This was spun off from the comments on this question.
As I understand it, in a PEG grammar it's possible to implement a non-greedy search by writing S <- E2 / E1 S (that is, S matches E2 if possible, otherwise E1 followed by S again).
However, I don't want to capture E2 in the final pattern; I want to capture up to E2. When trying to implement this in LPeg I've run into several issues, including 'Empty loop in rule' errors when building this into a grammar.
How would we implement the following search in an LPeg grammar: [tag] foo [/tag]
where we want to capture the contents of the tag in a capture table ('foo' in the example), but terminate before the ending tag? As I understand from the comments on the other question, this should be possible, but I can't find an example in LPeg.
Here's a snippet from the test grammar:
local tag_start = P"[tag]"
local tag_end = P"[/tag]"
G = P{'Pandoc',
...
NotTag = #tag_end + P"1" * V"NotTag"^0;
...
tag = tag_start * Ct(V"NotTag"^0) * tag_end;
}
It's me again. I think you need a better understanding of LPeg captures. A table capture (lpeg.Ct) gathers the captures made inside it into a table. As there are no simple captures (lpeg.C) specified in the NotTag rule, the final capture is an empty table {}.
Once more, I recommend you start from lpeg.re because it's more intuitive.
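-- Some LPeg installs expose this module as plain 're' instead of 'lpeg.re'.
local re = require('lpeg.re')
-- inspect is a third-party pretty-printer, used here only to display the table.
local inspect = require('inspect')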
local g = re.compile[=[--lpeg
tag <- tag_start {| {NotTag} |} tag_end
NotTag <- &tag_end / . NotTag
tag_start <- '[tag]'
tag_end <- '[/tag]'
]=]
print(inspect(g:match('[tag] foo [/tag]')))
-- output: { " foo " }
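For reference, the same grammar can also be written against the plain lpeg API, which is what the question asked for. This is only a minimal sketch, assuming LPeg is installed and loadable as lpeg; the rule names mirror the re version above, and the variable names g and result are illustrative.

```lua
local lpeg = require('lpeg')
local P, V, C, Ct = lpeg.P, lpeg.V, lpeg.C, lpeg.Ct

local tag_start = P'[tag]'
local tag_end   = P'[/tag]'

local g = P{ 'tag',
  -- C(...) captures the matched text; Ct(...) collects captures into a table.
  tag    = tag_start * Ct(C(V'NotTag')) * tag_end,
  -- Stop (without consuming input) when the end tag comes next; otherwise
  -- consume one character and recurse.
  -- Note P(1) (any single character), not P"1" (a literal "1").
  NotTag = #tag_end + P(1) * V'NotTag',
}

local result = g:match('[tag] foo [/tag]')
print(result[1])
```

Compared with the snippet in the question, the differences are the added lpeg.C around NotTag, P(1) in place of P"1", and plain recursion on NotTag without ^0.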
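Additionally, S <- E2 / E1 S is not the same as S <- E2 / E1 S*; the two are not equivalent, and the second form (which is what V"NotTag"^0 inside the NotTag rule amounts to) is what triggers the 'Empty loop in rule' error.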
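The difference matters in LPeg because a rule of this shape can succeed without consuming any input (when E2 is a predicate such as #tag_end), and LPeg refuses to repeat a pattern that may match the empty string. The V"NotTag"^0 inside Ct(...) in the question's tag rule has the same problem. A small sketch of the two variants, assuming LPeg is loadable as lpeg; the helper build and the rule name S are made up for illustration, and the exact error text may vary between LPeg versions:

```lua
local lpeg = require('lpeg')
local P, V = lpeg.P, lpeg.V

local tag_end = P'[/tag]'

-- Build a one-rule grammar inside pcall so that a construction error is
-- returned as (false, message) instead of being raised.
local function build(rule)
  return pcall(P, { 'S', S = rule })
end

-- S <- &tag_end / . S    plain recursion: accepted.
local ok_plain = build(#tag_end + P(1) * V'S')

-- S <- &tag_end / . S*   S may match the empty string, so S* is rejected
-- when the grammar is built.
local ok_star, err = build(#tag_end + P(1) * V'S'^0)

print(ok_plain, ok_star, err)
```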
However, if I were doing the same task, I wouldn't use a non-greedy match at all, as the recursive non-greedy form is generally slower than a greedy match.
tag <- tag_start {| {( !tag_end . (!'[' .)* )*} |} tag_end
Combining a not-predicate with greedy matching is enough.
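The same greedy rule can be sketched with the plain lpeg API. This assumes LPeg is loadable as lpeg; the names step, tag and result are illustrative, and P(1) - patt is used as LPeg's spelling of !patt . from the re rule above.

```lua
local lpeg = require('lpeg')
local P, C, Ct = lpeg.P, lpeg.C, lpeg.Ct

local tag_start = P'[tag]'
local tag_end   = P'[/tag]'

-- One step: a single character that does not begin the end tag, followed by
-- a greedy run of characters other than '['.
local step = (P(1) - tag_end) * (P(1) - P'[')^0

-- step always consumes at least one character, so step^0 is a valid loop.
local tag = tag_start * Ct(C(step^0)) * tag_end

local result = tag:match('[tag] foo [b] bar [/tag]')
print(result[1])
```

The end-tag check only runs once per step, typically at a '[' character, rather than before every single character, and no grammar recursion is needed.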