I started a project with a grammar that used %
(and the word mod
) for modulus operators, and now I would like to add %
as a trailing unary operator to divide by 100.
A few notes, I don't work with a C-based language, I have implemented my own tokenizer/compiler using the XML output from bison. And my choice of steps are critial for my implementation.
Is there a way I can make my grammar to compile without any shift/reduce errors in a LALR(1) compiler?
Basically the following statements are all valid:
5%
-> 0.055%%5
-> 0.05 mod 55%%%5
-> 0.0005 mod 5
etc.I just don't know how to formulate this into my grammar:
%token S_NUM
%%
default: mod_term ;
mod_term: _mod_value
| percent_term ;
_mod_value: mod_term O_PERCENT percent_term ;
percent_term: _percent_value
| value ;
_percent_value: value O_PERCENT ;
value: S_NUM ;
%%
I also compile it using the following statement:
bison -v --report=all --warnings=no-other -Werror=conflicts-sr --xml test.y -o test.y.xml
(Where I force shift/reduce as errors because of my environment)
Any ideas? I've played around with %left
and %right
specifiers, but no luck.
The ambiguity you have here is between '%'
being a postfix operator and an infix operator. This is very similar to the common expression parser issue with '-'
being both a prefix and an infix operator, and you can solve it the same way with an explicit %prec
directive. The traditional way to write this would be:
%left '%' /* left-associative infix operator */
%nonassoc POSTFIX /* postfix operations are higher precedence */
%token VAL
%%
expr: expr '%' expr
| expr '%' %prec POSTFIX
| VAL
;
using precedence to solve both the associative ambiguity of infix-% and the precedence ambiguity between infix and postfix.
To solve it without the precedence rules, you need something like:
%token S_NUM O_PERCENT
%%
default: mod_term ;
mod_term: _mod_value
| _mod_value O_PERCENT mod_term ;
_mod_value: _mod_value O_PERCENT ;
| S_NUM
;
which makes the infix-% right associative instead of left associative. Unfortunately I see no way of solving this without using precedence rules that also makes infix-% left associative. This is due to the fact that you can't decide whether a given '%'
token is infix or postfix until you see the token after it, so the non-terminal before the '%'
needs to be the same for both rules (_mod_value
here or expr
in the %prec code)