Hi i am writing a Parser for a Programming language my university uses, with jflex and Cup I started with just the first basic structures such as Processes an Variable Declarations.
I get the following Errors
Warning : *** Shift/Reduce conflict found in state #4
between vardecls ::= (*)
and vardecl ::= (*) IDENT COLON vartyp SEMI
and vardecl ::= (*) IDENT COLON vartyp EQEQ INT SEMI
under symbol IDENT
Resolved in favor of shifting.
Warning : *** Shift/Reduce conflict found in state #2
between vardecls ::= (*)
and vardecl ::= (*) IDENT COLON vartyp SEMI
and vardecl ::= (*) IDENT COLON vartyp EQEQ INT SEMI
under symbol IDENT
Resolved in favor of shifting.
My Code in Cup looks like this :
non terminal programm;
non terminal programmtype;
non terminal vardecl;
non terminal vardecls;
non terminal processdecl;
non terminal processdecls;
non terminal vartyp;
programm ::= programmtype:pt vardecls:vd processdecls:pd
{: RESULT = new SolutionNode(pt, vd, pd); :} ;
programmtype ::= IDENT:v
{: RESULT = ProblemType.KA; :} ;
vardecls ::= vardecl:v1 vardecls:v2
{: v2.add(v1);
RESULT = v2; :}
|
{: ArrayList<VarDecl> list = new ArrayList<VarDecl>() ;
RESULT = list; :}
;
vardecl ::= IDENT:id COLON vartyp:vt SEMI
{: RESULT = new VarDecl(id, vt); :}
| IDENT:id COLON vartyp:vt EQEQ INT:i1 SEMI
{: RESULT = new VarDecl(id, vt, i1); :}
;
vartyp ::= INTEGER
{: RESULT = VarType.Integer ; :}
;
processdecls ::= processdecl:v1 processdecls:v2
{: v2.add(v1);
RESULT = v2; :}
| {: ArrayList<ProcessDecl> list = new ArrayList<ProcessDecl>() ;
RESULT = list; :}
;
processdecl ::= IDENT:id COLON PROCESS vardecls:vd BEGIN END SEMI
{: RESULT = new ProcessDecl(id, vd); :}
;
I Guess i get the Errors because the Process Declaration and the VariableDeclaration both start with Identifiers then a ":" and then either the Terminal PROCESS or a Terminal like INTEGER. If so i'd like to know how i can tell my Parser to look ahead a bit more. Or whatever Solution is possible.
Thanks for your answers.
Your diagnosis is absolutely correct. Because the parser cannot know whether IDENT
starts a processdecl
or a vardecl
without two more lookahead tokens, it cannot know when it has just reduced a vardecl
and is looking at an IDENT
whether it is about to see another vardecl
or a processdecl
.
In the first case, it must just shift the IDENT
as part of the following vardecl
. In the second case, it needs to first reduce an empty vardecls
and then successively reduce vardecls
until it has constructed the complete list.
To get rid of the shift reduce conflict, you need to simplify the parser's decision-making.
The simplest solution is to allow the parser to accept declarations in any order. Then you end up with something like this:
program ::= program_type declaration_list ;
declaration_list ::=
var_declaration declaration_list
| process_declaration declaration_list
|
;
var_declaration_list ::=
var_declaration var_declaration_list
|
;
process_declaration ::=
IDENT:id COLON PROCESS var_declaration_list BEGIN END SEMI ;
(Personally, I'd make the declaration lists left-recursive rather than right-recursive, but it depends whether you prefer to append or prepend in the list's action. Left-recursion uses less parser stack.)
If you really want to insist that all variable declarations come before any process declaration, you can check for that in the action for declaration_list
.
Alternatively, you can start by making both types of declaration list left-recursive instead of right recursive. That will almost work, but it will still generate a shift-reduce conflict in the same state as the original grammar, this time because it needs to reduce an empty process declaration list before the first process declaration can be reduced.
Fortunately, that's easier to work around. If the process declaration list cannot be empty, there is no problem, so it's just a question of rearranging the productions:
program ::= program_type var_declaration_list process_declaration_list
| program_type var_declaration_list
;
var_declaration_list ::=
var_declaration var_declaration_list
|
;
process_declaration_list ::=
process_declaration_list process_declaration
| process_declaration
;
Finally, an ugly but possible alternative is to make the variable declaration list left-recursive and the process declaration list right-recursive. In that case, there is no empty production between the last variable declaration and the first process declaration.