javascript regex parsing pegjs

Parse expression with JavaScript using PEG


I have written code to parse the following expression

!=10 AND <=99

into

{
    operator: 'AND',
    values: [
        {
            op: '!=',
            value: '10',
        },
        {
            op: '<=',
            value: '99',
        },
    ],
}

using PEG.js. Please check the code sandbox.

But when I parse the following expression, I do not get the expected output:

=10 AND <=99 OR =1000

I want the following output

{
    operator: 'or',
    values: [
        {
            operator: 'and',
            values: [
                {
                    op: '=',
                    value: '10',
                },
                {
                    op: '<=',
                    value: '99',
                },
            ],
        },
        {
            op: '=',
            value: '1000',
        },
    ],
}

Code sandbox: https://codesandbox.io/s/javascript-testing-sandbox-forked-h5xh5?file=/src/index.js


Solution

  • Do not use string functions to parse comparison operators; make them part of your grammar. For chained operations like 'AND', use a rule like this:

    operation = head:argument tail:(ws "AND" ws argument)*

    and combine head and tail in the rule's action (the JavaScript code block that follows the expression).

    Complete example:

    expression = or

    // Lowest precedence: operands joined by "OR"
    or = head:and tail:(ws "OR" ws and)* {
        return tail.length ? {operator: 'or', values: [head].concat(tail.map(r => r.pop()))} : head
    }

    // Binds tighter than "OR": operands joined by "AND"
    and = head:not tail:(ws "AND" ws not)* {
        return tail.length ? {operator: 'and', values: [head].concat(tail.map(r => r.pop()))} : head
    }

    not
        = "NOT" ws right:not { return {op: 'not', right} }
        / factor

    factor
        = op:operator value:primary { return {op, value} }
        / value:primary { return {value} }

    // Two-character operators come first so '<=' is not matched as '<'
    operator = '!=' / '<=' / '>=' / '<' / '>' / '='

    // Return only the inner expression, not the parentheses
    primary = name / number / '(' e:expression ')' { return e }

    name = a:[A-Za-z_] b:[0-9a-zA-Z_]* { return a + b.join("") }

    number = a:[0-9]+ { return a.join("") }

    ws = [ \t]+
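
To make the head/tail handling concrete, here is a minimal sketch in plain JavaScript of what such an action does, run outside the parser. The helper name `foldChain` is hypothetical (it is not part of PEG.js), and the hand-built `tail` arrays only imitate the shape the parser would pass for a `(ws "AND" ws operand)*` match. The node shape uses the `operator`/`values` keys from the question.

```javascript
// Hypothetical helper (not part of PEG.js): folds a `head` match and a `tail`
// of repeated `(ws KEYWORD ws operand)` matches into one node.
function foldChain(operator, head, tail) {
  // No repetitions matched: pass the single operand through unchanged.
  if (tail.length === 0) return head;
  // Each tail entry is an array like [ws, keyword, ws, operand];
  // the operand is always the last element.
  const rest = tail.map(match => match[match.length - 1]);
  return { operator, values: [head, ...rest] };
}

// Hand-built inputs imitating the matches for "=10 AND <=99 OR =1000".
// The "AND" chain is folded first because it binds tighter than "OR".
const andNode = foldChain(
  'and',
  { op: '=', value: '10' },
  [[[' '], 'AND', [' '], { op: '<=', value: '99' }]]
);
const result = foldChain(
  'or',
  andNode,
  [[[' '], 'OR', [' '], { op: '=', value: '1000' }]]
);

console.log(JSON.stringify(result, null, 2));
```

Returning `head` unchanged when `tail` is empty is what keeps a lone comparison such as `=1000` from being wrapped in a one-element group. To run the grammar itself with PEG.js, the usual route is to build a parser with `peg.generate(grammarText)` and call its `parse(input)` method.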