I'm implementing a parser that processes an input string, extracts its components, validates them, and then builds a SQLAlchemy query from them. I'm currently working on the first part of the parser and have run into an issue: I want to raise a custom exception when a filter fails validation.
Filter definition:
filter_term = Combine(Optional(space) + Word(alphas) + Optional(space)).set_results_name("filter").set_parse_action(
    filter_validator).set_name("filter")
I would like to add further validation for the filter: only specific words can be used as filters, and they will be defined as a dictionary of aliases, for example:
{
    "animal": "animal",
    "dog": "animal",
    "cat": "animal",
    "pet": "animal"
}
In the provided code, I use a simple check to see whether the filter equals 'w', and if so, I raise an exception.
if t[0] == "w":
    raise FilterException("Invalid filter")
However, this is not what happens at the moment: my parser does raise an exception, but it is unrelated to the filter validation.
ParseException: Expected end of text, found 'and' (at char 15), (line:1, col:16) FAIL: Expected end of text, found 'and' (at char 15), (line:1, col:16)
Could I ask for your help in solving this problem?
parser:
from pyparsing import Word, Combine, Optional, DelimitedList, alphanums, Suppress, Group, one_of, alphas, \
    CaselessLiteral, infix_notation, opAssoc, OneOrMore, Keyword, CaselessKeyword, pyparsing_common, Forward, \
    ParseException, ParseSyntaxException, ZeroOrMore


class OrOperation:
    def __init__(self, instring, loc, toks):
        raise ParseException(instring, loc, "invalid OR given")


class AndOperation:
    def __init__(self, instring, loc, toks):
        raise ParseException(instring, loc, "invalid AND given")


class FilterException(ParseException):
    def __init__(self, pstr):
        super().__init__(pstr)


def filter_validator(s, l, t):
    if t[0] == "w":
        raise FilterException("Invalid filter")


# utils:
comma = Suppress(",")
space = Suppress(" ")
lbrace = Suppress("(")
rbrace = Suppress(")")
and_operator = Suppress(CaselessKeyword("AND"))
or_operator = CaselessKeyword("OR")
search_parser = Forward().set_name("search_expression")
literal_value = Forward().set_name("literal_value").set_results_name("literal_value")
delimited_list_delim = Optional(comma + Optional(space))
delimited_list = DelimitedList(literal_value, delim=delimited_list_delim).set_parse_action(
    lambda tokens: ", ".join(tokens))
string_literal = Word(alphanums + "_")
wildcard_literal = Combine(string_literal + "*").set_parse_action(lambda tokens: tokens[0].replace("*", "?"))
delimited_list_literal = lbrace + delimited_list + rbrace
filter_term = Combine(Optional(space) + Word(alphas) + Optional(space)).set_results_name("filter").set_parse_action(
    filter_validator).set_name("filter")
literal_value <<= delimited_list_literal | wildcard_literal | string_literal
equals_operator = one_of("= :")
comparison_operator = one_of("> >= < <= ")
not_equals_operator = CaselessLiteral("!=")
contains_operator = CaselessLiteral("~").set_parse_action(lambda tokens: "LIKE")
not_contains_operator = CaselessLiteral("!~").set_parse_action(lambda tokens: "NOT LIKE")
operator = equals_operator | not_equals_operator | contains_operator | not_contains_operator | comparison_operator
operator_term = Combine(Optional(space) + operator + Optional(space)).set_results_name("operator")
expression_term = Group(filter_term + operator_term + literal_value).set_parse_action(filter_validator) | Group(
    literal_value)
search_parser <<= infix_notation(expression_term,
                                 [
                                     (and_operator, 2, opAssoc.LEFT,
                                      lambda instring, loc, toks: AndOperation(instring, loc, toks)),
                                     (or_operator, 2, opAssoc.LEFT,
                                      lambda instring, loc, toks: OrOperation(instring, loc, toks))
                                 ])
try:
    result = search_parser.parse_string("w~(a, b c, d)")
    print(result.dump())
except FilterException as e:
    print("Filter failed:", e)
search_parser.run_tests('''
asas
was*
(as, b,c d)
((as, b,c d))
w=a
w=a*
w=(a, b c, d)
w:(a, b c, d)
w!=(a, b c, d)
w~(a, b c, d)
w!~(a, b c, d)
w>=(a, b c, d)
a>=(a, b c, d) and a=(a, b c, d)
w>=(a, b c, d) and w=(a, b c, d) and w=(a, b c, d)
w>=(a, b c, d) or (w=(a, b c, d) and w=(a, b c, d))
(w>=(a, b c, d) or w!~(a, b c, d)) or (w=(a, b c, d) and w=(a, b c, d))
w>=(a, b c, d) or w!~(a, b c, d) or (w=(a, b c, d) and w=(a, b c, d))
w>=(a, b c, d) or w!~(a, b c, d) or w=(a, b c, d) and w=(a, b c, d)
a>=(a, b c, d) and w!~(a, b c, d) or w=(a, b c, d) and w!=(a, b c, d)
''')
Pyparsing's internal logic uses ParseExceptions pretty heavily as it works through the parser structure of nested ParserElements. Since FilterException extends ParseException, it gets pulled in with all the rest of this try-and-retry internal exception raising and handling.
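To make the mechanism concrete, here is a toy model in plain Python. It is not pyparsing's actual implementation, and ParseError, match_first and the other names are made up for illustration. It only mimics the one behaviour that matters here: an alternation treats a parse exception as "this alternative did not match, try the next one".

```python
class ParseError(Exception):
    """Stands in for pyparsing's ParseException in this toy model."""


class FilterError(ParseError):
    """Validation error that subclasses the parse error (the problematic case)."""


class IndependentFilterError(Exception):
    """Validation error that does NOT subclass the parse error."""


def match_first(alternatives, text):
    """Try each alternative in order; a ParseError means 'try the next one'."""
    last_error = None
    for alternative in alternatives:
        try:
            return alternative(text)
        except ParseError as err:
            # Any ParseError subclass is silently absorbed here.
            last_error = err
    raise ParseError(f"no alternative matched {text!r}") from last_error


def make_filter(error_type):
    """Build a hypothetical filter rule that rejects 'w' with the given error type."""
    def parse_filter(text):
        if text == "w":
            raise error_type("Invalid filter")
        return ("filter", text)
    return parse_filter


def parse_literal(text):
    """Fallback alternative that accepts anything."""
    return ("literal", text)


# The validation error is swallowed and the fallback alternative wins.
swallowed = match_first([make_filter(FilterError), parse_literal], "w")
print(swallowed)  # ('literal', 'w')

# An unrelated exception type passes straight through to the caller.
try:
    match_first([make_filter(IndependentFilterError), parse_literal], "w")
except IndependentFilterError as err:
    print("Filter failed:", err)  # Filter failed: Invalid filter
```

In the question's grammar the same thing would happen at expression_term: the first alternative fails with FilterException, so the parser quietly falls back to Group(literal_value) and later reports an unrelated error.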
I changed your exception to this, and I think this gets things to come out closer to what you expect:
class FilterException(Exception):
    def __init__(self, pstr):
        self.msg = pstr
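With the exception decoupled from ParseException, the alias check from the question could be sketched roughly as below. The name FILTER_ALIASES, the lowercasing, and the error message wording are my assumptions, not something from your code. The function keeps the (s, l, t) parse action signature, and returning a value from a parse action replaces the matched tokens, so the parser would see the canonical name.

```python
# Hypothetical alias table, copied from the example in the question.
FILTER_ALIASES = {
    "animal": "animal",
    "dog": "animal",
    "cat": "animal",
    "pet": "animal",
}


class FilterException(Exception):
    """Deliberately not a ParseException subclass, so pyparsing will not absorb it."""

    def __init__(self, pstr):
        super().__init__(pstr)
        self.msg = pstr


def filter_validator(s, l, t):
    """Parse action: map a filter alias to its canonical name or raise."""
    # Assumption: filters are matched case-insensitively, ignoring stray spaces.
    name = t[0].strip().lower()
    try:
        return FILTER_ALIASES[name]
    except KeyError:
        raise FilterException(f"Invalid filter {t[0]!r} at position {l}") from None


# Called directly with a plain list here, just to show the behaviour
# without building a grammar around it.
print(filter_validator("dog=a", 0, ["dog"]))  # animal

try:
    filter_validator("w=a", 0, ["w"])
except FilterException as e:
    print("Filter failed:", e)
```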
A couple other notes on your parser:
Optional(space) for space skipping isn't going to work well, given that pyparsing implicitly skips spaces already. Instead, try:
filter_term = Word(alphas, as_keyword=True).set_results_name("filter").set_parse_action(
    filter_validator).set_name("filter")
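As a quick check of the implicit whitespace skipping, here is a cut-down sketch that needs pyparsing 3.x installed. The grammar is a simplified stand-in for yours (a single word as the value, all operators in one one_of), so treat it as an illustration only.

```python
from pyparsing import Word, alphas, one_of

# Simplified stand-ins for the question's terms; no explicit space handling at all.
filter_term = Word(alphas)("filter")
operator = one_of("= : != ~ !~ > >= < <=")("operator")
value = Word(alphas)("value")
expr = filter_term + operator + value

# pyparsing skips leading whitespace before each element by default,
# so all three inputs produce the same three-token shape.
for text in ("w=a", "w = a", "  w   >=   a"):
    print(expr.parse_string(text).as_list())
```

Each input should come back as three clean tokens with no surrounding spaces, which is why the Combine(Optional(space) + ...) wrappers can go.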
AndOperation and OrOperation take constructor signatures that already align with parse action signatures, so they can be used in infix_notation as just:
search_parser <<= infix_notation(expression_term,
                                 [
                                     (and_operator, 2, opAssoc.LEFT, AndOperation),
                                     (or_operator, 2, opAssoc.LEFT, OrOperation)
                                 ])
expression_term.run_tests('''
asas
was*
(as, b,c d)
((as, b,c d))
w=a
w=a*
w=(a, b c, d)
w:(a, b c, d)
w!=(a, b c, d)
w~(a, b c, d)
w!~(a, b c, d)
w>=(a, b c, d)
''')
search_parser.run_tests('''
a>=(a, b c, d) and a=(a, b c, d)
w>=(a, b c, d) and w=(a, b c, d) and w=(a, b c, d)
w>=(a, b c, d) or (w=(a, b c, d) and w=(a, b c, d))
(w>=(a, b c, d) or w!~(a, b c, d)) or (w=(a, b c, d) and w=(a, b c, d))
w>=(a, b c, d) or w!~(a, b c, d) or (w=(a, b c, d) and w=(a, b c, d))
w>=(a, b c, d) or w!~(a, b c, d) or w=(a, b c, d) and w=(a, b c, d)
a>=(a, b c, d) and w!~(a, b c, d) or w=(a, b c, d) and w!=(a, b c, d)
''')
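On the earlier point about passing AndOperation and OrOperation directly to infix_notation: your versions raise unconditionally, so here is a small sketch of what such classes might look like once they build result nodes instead. BinaryOperation and its operands attribute are names I made up, the operand is reduced to a single word, and the AND keyword is not wrapped in Suppress here, so the grouped tokens alternate operand, operator, operand. It needs pyparsing 3.x installed.

```python
from pyparsing import CaselessKeyword, Word, alphas, infix_notation, opAssoc


class BinaryOperation:
    """Hypothetical node type; the constructor matches the (s, loc, toks) signature."""

    def __init__(self, instring, loc, toks):
        # infix_notation hands over a single group: [operand, op, operand, ...],
        # so every second item is an operand.
        self.operands = list(toks[0][0::2])

    def __repr__(self):
        args = ", ".join(repr(operand) for operand in self.operands)
        return f"{type(self).__name__}({args})"


class AndOperation(BinaryOperation):
    pass


class OrOperation(BinaryOperation):
    pass


and_operator = CaselessKeyword("AND")
or_operator = CaselessKeyword("OR")
operand = Word(alphas)  # simplified stand-in for expression_term

expr = infix_notation(operand, [
    (and_operator, 2, opAssoc.LEFT, AndOperation),  # listed first, binds tighter
    (or_operator, 2, opAssoc.LEFT, OrOperation),
])

tree = expr.parse_string("a and b or c", parse_all=True)[0]
print(tree)  # OrOperation(AndOperation('a', 'b'), 'c')
```

The classes are handed over as-is, with no lambda wrapper, and the parsed result is a small tree that could later be walked to build the query.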