I'm trying to create grammar for GNU MathProg language from glpk package https://www3.nd.edu/~jeff/mathprog/glpk-4.47/doc/gmpl.pdf
Unfortunately grammar I've written so far is ambiguous.
I don't know how to tell bison which branch of parsing tree is correct when some identifier is used. For example:
numericExpression : numericLiteral
| identifier
| numericFunctionReference
| iteratedNumericExpression
| conditionalNumericExpression
| '(' numericExpression ')' %prec PARENTH
| '-' numericExpression %prec UNARY
| '+' numericExpression %prec UNARY
| numericExpression binaryArithmeticOperator numericExpression
;
symbolicExpression : stringLiteral
| symbolicFunctionReference
| identifier
| conditionalSymbolicExpression
| '(' symbolicExpression ')' %prec PARENTH
| symbolicExpression '&' symbolicExpression
;
indexingExpression : '{' indexingEntries '}'
| '{' indexingEntries ':' logicalExpression '}'
;
setExpression : literalSet
| identifier
| aritmeticSet
| indexingExpression
| iteratedSetExpression
| conditionalSetExpression
| '(' setExpression ')' %prec PARENTH
| setExpression setOperator setExpression
;
numericLiteral : INT
| FLT
;
linearExpression : identifier
| iteratedLinearExpression
| conditionalLinearExpression
| '(' linearExpression ')' %prec PARENTH
| '-' linearExpression %prec UNARY
| '+' linearExpression %prec UNARY
| linearExpression '+' linearExpression
| linearExpression '-' linearExpression
| linearExpression '*' numericExpression
| numericExpression '*' linearExpression
| linearExpression '/' numericExpression
;
logicalExpression : numericExpression
| relationalExpression
| iteratedLogicalExpression
| '(' logicalExpression ')' %prec PARENTH
| NOT logicalExpression %prec NEG
| logicalExpression AND logicalExpression
| logicalExpression OR logicalExpression
;
identifier : SYMBOLIC_NAME
| SYMBOLIC_NAME '[' listOfIndices ']'
;
listOfIndices : SYMBOLIC_NAME
| listOfIndices ',' SYMBOLIC_NAME
;
Identifier is simply name of 'variable'. Variable has a specific type (parameter, set, decision variable) and might be indexed. In code programmer has to declare variable type in statements like eg.
param p1;
param p2{1, 2} >=0;
set s1;
set s2{i in 1..5};
var v1 >=0;
var v2{S1,S2};
But when bison sees identifier doesn't know which rule should use and i'm getting reduce/reduce conflicts like
113 numericExpression: identifier .
123 symbolicExpression: identifier .
'&' reduce using rule 123 (symbolicExpression)
ELSE reduce using rule 113 (numericExpression)
ELSE [reduce using rule 123 (symbolicExpression)]
INTEGER reduce using rule 113 (numericExpression)
INTEGER [reduce using rule 123 (symbolicExpression)]
BINARY reduce using rule 113 (numericExpression)
BINARY [reduce using rule 123 (symbolicExpression)]
ASIGN reduce using rule 113 (numericExpression)
ASIGN [reduce using rule 123 (symbolicExpression)]
',' reduce using rule 113 (numericExpression)
',' [reduce using rule 123 (symbolicExpression)]
'>' reduce using rule 113 (numericExpression)
'>' [reduce using rule 123 (symbolicExpression)]
'}' reduce using rule 113 (numericExpression)
'}' [reduce using rule 123 (symbolicExpression)]
113 numericExpression: identifier .
123 symbolicExpression: identifier .
130 setExpression: identifier .
UNION reduce using rule 130 (setExpression)
DIFF reduce using rule 130 (setExpression)
SYMDIFF reduce using rule 130 (setExpression)
ELSE reduce using rule 113 (numericExpression)
ELSE [reduce using rule 123 (symbolicExpression)]
ELSE [reduce using rule 130 (setExpression)]
WITHIN reduce using rule 130 (setExpression)
IN reduce using rule 113 (numericExpression)
IN [reduce using rule 123 (symbolicExpression)]
I have also other problems but this one is blocker for me
Basically the problem is that identifier
appears in multiple rules:
numericExpression : identifier
symbolicExpression : identifier
setExpression: identifier
which may apply in the same context. One way to resolve this is to introduce different token types for set names and scalar (parameters and variables) names:
symbolicExpression : SCALAR_NAME
setExpression: SET_NAME
This will resolve conflict with set names. I don't think that you need this rule
numericExpression : identifier
because there is an automatic conversion from strings to numbers in AMPL and therefore MathProg, which is a subset of AMPL, so symbolicExpression
should be allowed in the context of numericExpression
.
Note that the grammar is not context-free, so you'll need to pull additional information like the name category above from the symbol table to resolve problems like this.
My flex is a bit rusty but I think you can do something like that in an identifier rule:
return is_setname(...) ? TOK_SET_NAME : TOK_SCALAR_NAME;
where is_setname
is a function to check whether the current identifier is a set. You'll need to define such function and get the necessary information from a symbol table.