I'm trying to learn LPeg's re module, and it has been quite an interesting experience, especially since the official documentation is so nice.
However, some topics seem to be poorly explained there, for example the named group capture construction: {:name: p :}.
Consider the following example; I don't understand why it does not match:
print(re.compile
[[item <- ('<' {:tag: %w+!%w :} '>' item+ '</' =tag '>') / %w+!%w]]
:match[[<person><name>James</name><address>Earth</address></person>]])
-- outputs nil
Can anyone help me understand what is going wrong here? I have thought about this quite a bit, and it really seems like I'm missing something important.
This is a late answer, but you can try the following pattern:
result = re.compile[[
item <- ({| %s* '<' {:tag: %w+ :} %s* '>' (item / %s* { (!(%s* '<') .)+ }) %s* '</' =tag '>' |})+
]]:match[[
<person>
<name>
James
</name>
<address>Earth</address>
</person>
]]
This uses table captures to parse the XML, with leading and trailing whitespace stripped from each element's text. The resulting table is:
tag = "person"
[1] = {
  tag = "name"
  [1] = "James"
}
[2] = {
  tag = "address"
  [1] = "Earth"
}
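As for why the original pattern returns nil, a likely explanation, going by how the LPeg manual describes back captures, is that =tag refers to the most recent complete group named tag that is not enclosed in another complete capture. Without a table capture, the tag groups of the nested elements stay visible, so the =tag that should close person ends up being compared against "address". Wrapping each element in {| ... |} encloses the inner groups and hides them from the outer back reference. The sketch below contrasts the two on the single-line input from the question. The names flat, nested, and input are only illustrative, and it assumes an LPeg version whose re module supports {| |} table captures.

```lua
local re = require("re")  -- re ships with LPeg; assumed to be installed

local input = "<person><name>James</name><address>Earth</address></person>"

-- No table capture: the nested {:tag: ... :} groups remain visible, so the
-- last =tag sees "address" (the most recent group) instead of "person".
local flat = re.compile[[
  item <- ('<' {:tag: %w+ :} '>' item+ '</' =tag '>') / %w+
]]
print(flat:match(input))  --> nil

-- Each element wrapped in {| ... |}: once an inner table capture is complete,
-- its tag group is enclosed, and the outer =tag finds "person" again.
local nested = re.compile[[
  item <- {| '<' {:tag: %w+ :} '>' item+ '</' =tag '>' |} / { %w+ }
]]
local t = nested:match(input)
print(t.tag, t[1].tag, t[1][1], t[2].tag, t[2][1])
--> person  name  James  address  Earth
```

This variant does not handle whitespace between elements; the pattern above adds %s* for that.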