
Why does my parser raise a ParseException with the wrong message?


I'm implementing a parser that processes an input string, extracts its components, validates them, and then builds a SQLAlchemy query from the result. I'm currently working on the first part of the parser and have run into a problem: I want to raise a custom exception when a filter is invalid.

Filter definition:

filter_term = Combine(Optional(space) + Word(alphas) + Optional(space)).set_results_name("filter").set_parse_action(
    filter_validator).set_name("filter")

I would like to add further validation for the filter. Only specific words may be used as filters, and they will be defined in a dictionary of aliases, for example:

{
    "animal": "animal",
    "dog": "animal",
    "cat": "animal",
    "pet": "animal"
}
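
A minimal sketch of the lookup I have in mind. The names `FILTER_ALIASES` and `resolve_filter` are placeholders I made up for illustration, and raising `ValueError` for an unknown name is only an assumption about how the failure should be reported:

```python
# Placeholder name for the alias dictionary shown above.
FILTER_ALIASES = {
    "animal": "animal",
    "dog": "animal",
    "cat": "animal",
    "pet": "animal",
}


def resolve_filter(name):
    """Return the canonical filter for an alias, or raise ValueError."""
    try:
        return FILTER_ALIASES[name]
    except KeyError:
        # Hide the KeyError so the caller only sees the validation error.
        raise ValueError(f"Invalid filter: {name!r}") from None


print(resolve_filter("dog"))  # prints: animal
```

The filter's parse action would call a helper like this with the matched filter name.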

In the code below, I use a simple check: if the filter equals 'w', I raise an exception.

    if t[0] == "w":
        raise FilterException("Invalid filter")

However, this exception never surfaces. The parser does raise an exception, but it is unrelated to filter validation:

ParseException: Expected end of text, found 'and' (at char 15), (line:1, col:16) FAIL: Expected end of text, found 'and' (at char 15), (line:1, col:16)

Could you help me solve this problem?

parser:

from pyparsing import Word, Combine, Optional, DelimitedList, alphanums, Suppress, Group, one_of, alphas, \
    CaselessLiteral, infix_notation, opAssoc, OneOrMore, Keyword, CaselessKeyword, pyparsing_common, Forward, \
    ParseException, ParseSyntaxException, ZeroOrMore



class OrOperation:
    def __init__(self, instring, loc, toks):
        raise ParseException(instring, loc, "invalid OR given")


class AndOperation:
    def __init__(self, instring, loc, toks):
        raise ParseException(instring, loc, "invalid AND given")


class FilterException(ParseException):
    def __init__(self, pstr):
        super().__init__(pstr)


def filter_validator(s, l, t):
    if t[0] == "w":
        raise FilterException("Invalid filter")


# utils:
comma = Suppress(",")
space = Suppress(" ")
lbrace = Suppress("(")
rbrace = Suppress(")")
and_operator = Suppress(CaselessKeyword("AND"))
or_operator = CaselessKeyword("OR")

search_parser = Forward().set_name("search_expression")
literal_value = Forward().set_name("literal_value").set_results_name("literal_value")

delimited_list_delim = Optional(comma + Optional(space))
delimited_list = DelimitedList(literal_value, delim=delimited_list_delim).set_parse_action(
    lambda tokens: ", ".join(tokens))

string_literal = Word(alphanums + "_")
wildcard_literal = Combine(string_literal + "*").set_parse_action(lambda tokens: tokens[0].replace("*", "?"))
delimited_list_literal = lbrace + delimited_list + rbrace

filter_term = Combine(Optional(space) + Word(alphas) + Optional(space)).set_results_name("filter").set_parse_action(
    filter_validator).set_name("filter")
literal_value <<= delimited_list_literal | wildcard_literal | string_literal

equals_operator = one_of("= :")
comparison_operator = one_of("> >= < <= ")
not_equals_operator = CaselessLiteral("!=")
contains_operator = CaselessLiteral("~").set_parse_action(lambda tokens: "LIKE")
not_contains_operator = CaselessLiteral("!~").set_parse_action(lambda tokens: "NOT LIKE")
operator = equals_operator | not_equals_operator | contains_operator | not_contains_operator | comparison_operator
operator_term = Combine(Optional(space) + operator + Optional(space)).set_results_name("operator")
expression_term = Group(filter_term + operator_term + literal_value).set_parse_action(filter_validator) | Group(
    literal_value)

search_parser <<= infix_notation(expression_term,
                                 [
                                     (and_operator, 2, opAssoc.LEFT,
                                      lambda instring, loc, toks: AndOperation(instring, loc, toks)),
                                     (or_operator, 2, opAssoc.LEFT,
                                      lambda instring, loc, toks: OrOperation(instring, loc, toks))
                                 ])

try:
    result = search_parser.parse_string("w~(a, b c, d)")
    print(result.dump())
except FilterException as e:
    print("Filter failed:", e)

search_parser.run_tests('''
asas
was*
(as, b,c d)
((as, b,c d))
w=a
w=a*
w=(a, b c, d)
w:(a, b c, d)
w!=(a, b c, d)
w~(a, b c, d)
w!~(a, b c, d)
w>=(a, b c, d)
a>=(a, b c, d) and a=(a, b c, d)
w>=(a, b c, d) and w=(a, b c, d) and w=(a, b c, d)
w>=(a, b c, d) or (w=(a, b c, d) and w=(a, b c, d))
(w>=(a, b c, d) or w!~(a, b c, d))  or (w=(a, b c, d) and w=(a, b c, d))
w>=(a, b c, d) or w!~(a, b c, d)  or (w=(a, b c, d) and w=(a, b c, d))
w>=(a, b c, d) or w!~(a, b c, d)  or w=(a, b c, d) and w=(a, b c, d)
a>=(a, b c, d) and w!~(a, b c, d)  or w=(a, b c, d) and w!=(a, b c, d)
''')

Solution

  • Pyparsing uses ParseException heavily in its internal logic as it works through the parser structure of nested ParserElements. Because FilterException extends ParseException, it is caught along with all the other exceptions in this internal try-and-retry handling and is treated as just another failed match.

    I changed your exception to the following, which I think gets the results closer to what you expect:

    class FilterException(Exception):
        def __init__(self, pstr):
            # Pass the message to Exception so that str(e) and print(e) show it
            super().__init__(pstr)
            self.msg = pstr

    A couple other notes on your parser:

    expression_term.run_tests('''
        asas
        was*
        (as, b,c d)
        ((as, b,c d))
        w=a
        w=a*
        w=(a, b c, d)
        w:(a, b c, d)
        w!=(a, b c, d)
        w~(a, b c, d)
        w!~(a, b c, d)
        w>=(a, b c, d)
        ''')
        
    search_parser.run_tests('''
        a>=(a, b c, d) and a=(a, b c, d)
        w>=(a, b c, d) and w=(a, b c, d) and w=(a, b c, d)
        w>=(a, b c, d) or (w=(a, b c, d) and w=(a, b c, d))
        (w>=(a, b c, d) or w!~(a, b c, d))  or (w=(a, b c, d) and w=(a, b c, d))
        w>=(a, b c, d) or w!~(a, b c, d)  or (w=(a, b c, d) and w=(a, b c, d))
        w>=(a, b c, d) or w!~(a, b c, d)  or w=(a, b c, d) and w=(a, b c, d)
        a>=(a, b c, d) and w!~(a, b c, d)  or w=(a, b c, d) and w!=(a, b c, d)
        ''')
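
To illustrate the try-and-retry behaviour described above, here is a toy model in plain Python. It is not pyparsing's actual code: `ParseError`, `first_match`, `make_filter_expression`, and `bare_literal` are hypothetical stand-ins for an alternation that treats its own exception type as "this alternative did not match, try the next one".

```python
from itertools import takewhile


class ParseError(Exception):
    """Stand-in for the parser's own 'no match here' signal."""


class SwallowedFilterError(ParseError):
    """Validation error that inherits from the parser's exception."""


class PropagatingFilterError(Exception):
    """Validation error that is independent of the parser's exception."""


def first_match(alternatives, text):
    """Try each alternative in order; a ParseError means 'try the next one'."""
    for alternative in alternatives:
        try:
            return alternative(text)
        except ParseError:
            continue
    raise ParseError(f"no alternative matched {text!r}")


def make_filter_expression(error_cls):
    """Build a 'name=value' matcher that rejects the filter 'w' with error_cls."""
    def filter_expression(text):
        name, sep, value = text.partition("=")
        if not sep:
            raise ParseError("expected '='")
        if name == "w":
            raise error_cls("Invalid filter")
        return ("filter", name, value)
    return filter_expression


def bare_literal(text):
    """Fallback alternative: match only the leading run of letters."""
    word = "".join(takewhile(str.isalpha, text))
    if not word:
        raise ParseError("expected a word")
    return ("literal", word)


# The validation error is swallowed and the fallback matches just "w".
swallowed = [make_filter_expression(SwallowedFilterError), bare_literal]
print(first_match(swallowed, "w=a"))  # prints: ('literal', 'w')

# The independent exception escapes the retry loop and reaches the caller.
propagating = [make_filter_expression(PropagatingFilterError), bare_literal]
try:
    first_match(propagating, "w=a")
except PropagatingFilterError as e:
    print("Filter failed:", e)  # prints: Filter failed: Invalid filter
```

In the first call the fallback consumes only "w" and leaves the rest of the input unmatched, which is the kind of situation that later shows up as an "Expected end of text" error instead of the validation message.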