when i try to use$ bison --output=calcy.c -d calc.y i get this msg:
calc.y: conflicts: 2 shift/reduce
and this is my code
program:DEBUT
corps
FIN
;
corps:liste_declaration
liste_instruction
;
liste_declaration:
| declaration";"liste_declaration
;
declaration: ENTIER IDENTIFICATEUR
;
liste_instruction:
|instruction";"liste_instruction
;
instruction: IDENTIFICATEUR"="expression
| DEBUT liste_instruction FIN
| SI expression
ALORS instruction
SINON instruction
| SI expression ALORS instruction
| TANTQUE expression FAIRE instruction
| ECRIRE liste_argument
| LIRE IDENTIFICATEUR
;
liste_argument:
| expression
| expression ";"liste_argument
;
expression:expression"<"expression_simple
|expression">"expression_simple
|expression"=="expression_simple
|expression"!="expression_simple
|expression"<="expression_simple
|expression">="expression_simple
|expression_simple
;
expression_simple:expression_simple"+"terme
|expression_simple"-"terme
|"("expression_simple")"
|terme
|"-"terme
;
terme:terme"*"facteur
|terme"/"facteur
|terme"^"facteur
|facteur
;
facteur:IDENTIFICATEUR|NOMBRE
;
I searched a lot and I think it's caused by ambiguity, but I cant find where it is.
I used the option -v
as Mr Chris Dodd suggested in a comment, and I got this in my .output file.nv
Grammar
0 $accept: program $end
1 program: DEBUT corps FIN
2 corps: liste_declaration liste_instruction
3 liste_declaration: /* empty */
4 | declaration ";" liste_declaration
5 declaration: ENTIER IDENTIFICATEUR
6 liste_instruction: /* empty */
7 | instruction ";" liste_instruction
8 instruction: IDENTIFICATEUR "=" expression
9 | DEBUT liste_instruction FIN
10 | SI expression ALORS instruction SINON instruction
11 | SI expression ALORS instruction
12 | TANTQUE expression FAIRE instruction
13 | ECRIRE liste_argument
14 | LIRE IDENTIFICATEUR
15 liste_argument: /* empty */
16 | expression
17 | expression ";" liste_argument
18 expression: expression "<" expression_simple
19 | expression ">" expression_simple
20 | expression "==" expression_simple
21 | expression "!=" expression_simple
22 | expression "<=" expression_simple
23 | expression ">=" expression_simple
24 | expression_simple
25 expression_simple: expression_simple "+" terme
26 | expression_simple "-" terme
27 | terme
28 terme: terme "*" facteur
29 | terme "/" facteur
30 | facteur
31 facteur: X "^" facteur
32 | X
33 X: "(" expression_simple ")"
34 | "-" X
35 | IDENTIFICATEUR
36 | NOMBRE
These are the states with conflicts:
state 33
16 liste_argument: expression .
17 | expression . ";" liste_argument
18 expression: expression . "<" expression_simple
19 | expression . ">" expression_simple
20 | expression . "==" expression_simple
21 | expression . "!=" expression_simple
22 | expression . "<=" expression_simple
23 | expression . ">=" expression_simple
";" shift, and go to state 53
"<" shift, and go to state 40
">" shift, and go to state 41
"==" shift, and go to state 42
"!=" shift, and go to state 43
"<=" shift, and go to state 44
">=" shift, and go to state 45
";" [reduce using rule 16 (liste_argument)]
$default reduce using rule 16 (liste_argument)
state 56
10 instruction: SI expression ALORS instruction . SINON instruction
11 | SI expression ALORS instruction .
SINON shift, and go to state 70
SINON [reduce using rule 11 (instruction)]
$default reduce using rule 11 (instruction)
You have two shift-reduce conflicts, which have completely different causes. Only one is strictly speaking an ambiguity; ironically, it is the easier one to fix.
One is the classic "dangling-else" ambiguity (which Wikipedia's French-language page suggests might be translated as sinon pendant. This ambiguity is easy to fix or workaround:
Add %expect 1
to your bison file. (Bison lets you put this declaration anywhere in the file, so it would be reasonable to put it just before or just after the instruction
definition. But in other yacc dialects, you'd need to put it before the first %%
in the file.)
All this does is to tell the parser generator to suppress the warning message if there is exactly one shift-reduce conflict. The conflict will be resolved using the default resolution, which is to prefer to shift the SINON
token. As it happens, this is exactly what you want in this case, which is why it is the default resolution.
Use precedence declarations to make shifting SINON
preferable to reducing an ALORS
production. (Productions are named after the last terminal in the production, not the first one.)
This has precisely the same effect as the first one, but in a slightly more robust way since it is not dependent on the total count of shift-reduce conflicts.
Rewrite the grammar to make SI ... ALORS ... SINON
unambiguous.
This is arguably the best solution, but it is a lot more work since it forces you to consider all compound statements which can end with a statement. In your case, there are just two -- SI and TANTQUE -- but you might want to add other looping constructs later. You can find sample code in the Wikipedia page (which is currently quite a mess, so it may not be the best place to look at the moment) or by searching for "dangling else" and using some discretion.
;
The other shift-reduce conflict is due to the fact that you use ;
both to separate statements and to separate arguments in a ECRIRE
statement. That means that when the parser reaches a ;
while analysing an ECRIRE
statement, it cannot know whether or not the ;
terminates the statement or not. If it's the end of the statement, the parser must reduce ECRIRE liste_argument
to instruction
; otherwise, it needs to shift the ;
in order to add another expression
to the liste_argument
.
This is not, strictly speaking, ambiguous, because your language requires all statements to start with a keyword or with IDENTIFICATEUR "="
, and neither of these can be the start of an expression. So the parser would only need to look at a maximum of three symbols (including the ;
) in order to know for sure whether what comes next is an expression or a statement.
That fact doesn't help you much, since bison parsers will normally only look at the next symbol, which is the ;
, and must make reduction decisions based on that. So even though the grammar is not ambiguous -- every valid program can only be parsed in one way -- the bison parser cannot know which parse to use at the moment it needs to know.
It is possible to fix this, because it is possible to write a one-symbol lookahead grammar for any language which can be parsed with bounded lookahead. But it's tedious and the resulting grammars are big, clunky, and hard to read.
Also, it may be the case that some future addition to your language will make ECRIRE
statements actually ambiguous. For example, you do not (currently) allow function calls. But at some point, you may well want to add functions to your language, and when you do so you might want to make it possible to call a function without assigning its return value to any variable. That would normally be done by just adding expression
to the list of possible instruction
s, but as soon as you do that, the semicolon in ECRIRE
becomes ambiguous, since it could be followed by an expression to be printed, or by an expression to be executed as the next statement.
If you're certain that this will never be an issue and that the grammar really will be unambiguous, as it is now, you could fix the problem by telling bison to produce a %glr-parser
. Although that complicates bison's job, it doesn't have much impact on you: you can continue to use the grammar in the same way, as long as it is not ambiguous.
But if you think you might want to introduce expression statements at some point, you might be better off changing the syntax of your language. If you used ,
to separate arguments to ECRIRE
and reserved ;
to separate statements (which is a pretty common approach), then the two delimiters will not interfere with each other and the grammar will be automatically unambiguous.