python string parsing autosys

Parse a Boolean expression in Python


I have a Boolean expression that supports the operators & (logical AND) and | (logical OR), parentheses for grouping, status codes such as s, f, d, n, t, and job names.

A status code represents the status of a job (e.g. s = success, f = failure). The job name is enclosed in parentheses after the status code, optionally followed by a second argument, which is a number within quotes (e.g. "11:00").

Example input:

( s(job_A, "11:00") & f(job_B) ) | ( s(job_C) & t(job_D) )

Given such a string in Python, I need to replace each existing job name with a new job name that adds a prefix. Everything else should remain the same:

Example output:

( s(prefix_job_A, "11:00") & f(prefix_job_B) ) | ( s(prefix_job_C) & t(prefix_job_D) )

This logical expression can be arbitrarily nested like any Boolean expression, and since that makes it a non-regular language, I assume we can't use regexes.

Please note: the job names are NOT known beforehand, so we can't statically store the names in a dictionary and perform a replacement.

The approach I have thought of so far is to generate an expression tree and perform the replacements in the OPERAND nodes of that tree, but I am not sure how to proceed. Is there a library in Python that can help me define the grammar to build this tree? How do I specify the grammar?

Can someone help me with the approach?

Edit: The job names don't have any particular form. The minimum length of a job name is 6 and the job names are alphanumeric with underscores.


Solution

  • Given that job names are alphanumeric plus underscores and at least 6 characters long, a regex alone should be enough, since nothing else in the given strings appears to match that pattern. Note that a quoted argument containing 6 or more such characters in a row would also be matched.

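    import re  # the standard-library module is sufficient here
    
    # Names are padded to "job__A" etc. so they meet the 6-character minimum.
    exp = '( s(job__A, "11:00") & f(job__B) ) | ( s(job__C) & t(job__D) )'
    # At least 6 alphanumeric or underscore characters.
    # Raw string, so \d is not treated as a string escape.
    name_regex = r"([a-zA-Z\d_]{6,})"
    prefix = "prefix_"
    
    # \g<1> is the unambiguous form of the \1 backreference.
    new_exp = re.sub(name_regex, rf"{prefix}\g<1>", exp)
    print(new_exp)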
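
    If relying on the 6-character minimum feels fragile, a variation is to anchor the match on the status code instead. This is only a sketch, and it assumes every job name appears as the first argument of a call to one of the status codes s, f, d, n, t, as in the question's examples. The names `STATUS_CALL` and `add_prefix` are made up for illustration.

```python
import re

# Assumption: a job name is always the first argument of a status call
# such as s(...), f(...), d(...), n(...) or t(...).
# Group 1: status letter, "(" and any whitespace; group 2: the job name.
STATUS_CALL = re.compile(r"\b([sfdnt]\(\s*)(\w+)")


def add_prefix(expression: str, prefix: str) -> str:
    """Return `expression` with `prefix` prepended to every job name."""
    # A function replacement keeps the prefix literal, so backslashes or
    # digits in the prefix are never read as backreferences.
    return STATUS_CALL.sub(
        lambda m: m.group(1) + prefix + m.group(2), expression
    )


exp = '( s(job_A, "11:00") & f(job_B) ) | ( s(job_C) & t(job_D) )'
print(add_prefix(exp, "prefix_"))
```

    Because only the token right after `s(`, `f(`, etc. is rewritten, nesting depth does not matter and the quoted argument is left untouched. If the real grammar allows job names in other positions, this pattern would need to change.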