Using Jison to convert a list of commands into an array of objects


I am trying to use Jison, which is a JS port of Bison, the parser generator. My goal is to convert this input:

foo(10)
bar()
foo(28)
baz(28)

into this:

[
  { func: 'foo', arg: 10 },
  { func: 'bar' },
  { func: 'foo', arg: 28 },
  { func: 'baz', arg: 28 }
]

Here is my Jison grammar file:

%lex

%%
[0-9]+\b                  return 'INTEGER'
\(                        return 'OPEN_PAREN'
\)                        return 'CLOSE_PAREN'
[\w]+\s*(?=\()            return 'FUNC_NAME'
\n+                       return 'LINE_END'

/lex

%%
expressions
  : expressions expression
  | expression
  ;

expression
  : LINE_END
  | e LINE_END
    {return $1}
  ;

e
  : FUNC_NAME OPEN_PAREN INTEGER CLOSE_PAREN
    {$$ = { func: $1, arg: $3 };}

  | FUNC_NAME OPEN_PAREN CLOSE_PAREN
    {$$ = { func: $1 };}
  ;

The output of the resulting generated parser is { func: 'foo', arg: 10 }. In other words, it only returns the parsed object from the first statement and ignores the rest.

I know my problem has to do with semantic value and the "right side" of expression, but I am pretty lost otherwise.

Any help would be extremely appreciated!


Solution

  • I'm appending a grammar that does what you asked for. The salient changes are:

    1. LINE_END has the regex \n+|$ so that it also matches the end of input.

    2. I've added a start production whose role is only to return the final result.

    3. Rewrote the expression production to produce arrays. I've also removed the {return $1} from the e LINE_END rule since that caused the parser to return prematurely.

    4. Modified the expressions production to concatenate the arrays.
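
To illustrate item 3: as far as I can tell, a Jison-generated parser treats any value returned from an action as the result of parse() and stops at that reduction. The sketch below is not Jison's code. runActions and the two action lists are hypothetical stand-ins that mimic only that one behaviour.

```javascript
// Hypothetical stand-in for a parser driver: runs one action per reduction
// and stops as soon as an action returns a value (assumed Jison behaviour).
function runActions(actions) {
  for (const action of actions) {
    const result = action();
    if (result !== undefined) {
      return result; // parsing ends here; later reductions never happen
    }
  }
  return true; // no action returned a value
}

// Original grammar: `e LINE_END {return $1}` returns on the first statement.
const original = [
  () => ({ func: "foo", arg: "10" }), // first statement: returns immediately
  () => ({ func: "bar" }),            // never reached
];

// Fixed grammar: intermediate actions only build up a value (modelled here
// as pushing to an array and returning undefined); only `start` returns.
const collected = [];
const fixed = [
  () => { collected.push({ func: "foo", arg: "10" }); },
  () => { collected.push({ func: "bar" }); },
  () => collected, // the `start` rule
];

const originalResult = runActions(original);
const fixedResult = runActions(fixed);
console.log(originalResult); // only the first object
console.log(fixedResult);    // both objects, in an array
```

This is why the return belongs in a single top-level rule: everything below it should assign to $$ instead.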

    For the expression and expressions productions, I've used Jison's shorthand syntax for actions. For instance, e LINE_END -> [$1] is equivalent to e LINE_END { $$ = [$1]; }.
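
The way these shorthand actions combine can be sketched in plain JavaScript. The function names below are hypothetical and each one only mirrors the action of one rule. The token stream is assumed to be a bare LINE_END followed by two calls, each ending in LINE_END.

```javascript
// Each function mirrors one action from the grammar (names are made up).
const fromBlankLine = () => [];                      // expression : LINE_END -> []
const fromCall = ($1) => [$1];                       // expression : e LINE_END -> [$1]
const appendExpression = ($1, $2) => $1.concat($2);  // expressions : expressions expression

// expressions : expression  (default action, $$ = $1)
let expressions = fromBlankLine();

// Two further reductions of `expressions expression`.
expressions = appendExpression(expressions, fromCall({ func: "foo", arg: "10" }));
expressions = appendExpression(expressions, fromCall({ func: "bar" }));

console.log(expressions);
```

Because concat flattens one level of arrays, the empty array produced for a bare LINE_END contributes nothing to the final list.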

    Here is the grammar:

    %lex
    
    %%
    [0-9]+\b                  return 'INTEGER'
    \(                        return 'OPEN_PAREN'
    \)                        return 'CLOSE_PAREN'
    [\w]+\s*(?=\()            return 'FUNC_NAME'
    \n+|$                     return 'LINE_END'
    
    /lex
    
    %%
    start:
      expressions
      { return $1 }
      ;
    
    expressions
      : expressions expression -> $1.concat($2)
      | expression
      ;
    
    expression
      : LINE_END -> []
      | e LINE_END -> [$1]
      ;
    
    e 
      : FUNC_NAME OPEN_PAREN INTEGER CLOSE_PAREN
        {$$ = { func: $1, arg: $3 };}
    
      | FUNC_NAME OPEN_PAREN CLOSE_PAREN
        {$$ = { func: $1 };}
      ;
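
One detail to check against the desired output: $3 holds the matched text, so this grammar should yield arg: '10' (a string) rather than arg: 10. Writing arg: Number($3) in the action is one way to handle that in the grammar itself. Alternatively, the result can be normalised afterwards, as in this sketch. normalizeCommands is a hypothetical helper, and the trim is there only because the FUNC_NAME pattern allows whitespace before the parenthesis.

```javascript
// Hypothetical post-processing step for the array returned by the parser.
function normalizeCommands(commands) {
  return commands.map((cmd) => {
    const out = { func: cmd.func.trim() }; // drop whitespace captured by \s*
    if ("arg" in cmd) {
      out.arg = Number(cmd.arg);           // "10" -> 10
    }
    return out;
  });
}

// Sample data shaped like the parser's result, with string arguments.
const parsed = [
  { func: "foo", arg: "10" },
  { func: "bar " },
  { func: "baz", arg: "28" },
];

const normalized = normalizeCommands(parsed);
console.log(normalized);
```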
    

    An aside: Jison is not a port of Bison. It is a parser generator strongly inspired by Bison, but it has features that Bison lacks, and it does not support some features that Bison has.