I'm new to the "LPeg" and "re" modules of Lua, currently I want to write a pattern based on following rules:
For example, below strings should match the pattern:
,sys.dba_objects, --should return "sys.dba_objects"
SyS.Dba_OBJECTS
cdb_objects
dba_hist_snapshot) --should return "dba_hist_snapshot"
Currently my pattern is below which can only match non-alphanumeric+string in upper case :
p=re.compile[[
pattern <- %W {owner* name}
owner <- 'SYS.'/ 'PUBLIC.'
name <- {prefix %a%a (%w/"_"/"$"/"#")+}
prefix <- "GV_$"/"GV$"/"V_$"/"V$"/"DBA_"/"ALL_"/"CDB_"
]]
print(p:match(",SYS.DBA_OBJECTS"))
My questions are:
Highly appreciated if you can provide the pattern or related function.
You can try to tweak this grammar
local re = require're'
local p = re.compile[[
pattern <- ((s? { <name> }) / s / .)* !.
name <- (<owner> s? '.' s?)? <prefix> <ident>
owner <- (S Y S) / (P U B L I C)
prefix <- (G V '_'? '$') / (V '_'? '$') / (D B A '_') / (C D B '_')
ident <- [_$#%w]+
s <- (<comment> / %s)+
comment <- '--' (!%nl .)*
A <- [aA]
B <- [bB]
C <- [cC]
D <- [dD]
G <- [gG]
I <- [iI]
L <- [lL]
P <- [pP]
S <- [sS]
U <- [uU]
V <- [vV]
Y <- [yY]
]]
local m = { p:match[[
,sys.dba_objects, --should return "sys.dba_objects"
SyS.Dba_OBJECTS
cdb_objects
dba_hist_snapshot) --should return "dba_hist_snapshot"
]] }
print(unpack(m))
. . . prints match table m
:
sys.dba_objects SyS.Dba_OBJECTS cdb_objects dba_hist_snapshot
Note that case-insensitivity is quite hard to achieve out of the lexer so each letter has to get a separate rule -- you'll need more of these eventually.
This grammar is taking care of the comments in your sample and skips them along with whitespace so matches after "should return" are not present in output.
You can fiddle with prefix
and ident
rules to specify additional prefixes and allowed characters in object names.
Note: !.
means end-of-file. !%nl
means "not end-of-line". ! p
and & p
are constructing non-consuming patterns i.e. current input pointer is not incremented on match (input is only tested).
Note 2: print
-ing with unpack
is a gross hack.
Note 3: Here is a tracable LPeg re that can be used to debug grammars. Pass true
for 3-rd param of re.compile
to get execution trace with test/match/skip action on each rule and position visited.