I would like to parse a set of expressions: R[3]C
, R[2]C
, R[3]C-R[2]C
... There is a conflict I cannot solve...
Here is a part of lexer.mll
:
rule token = parse
| 'R' { R }
| 'C' { C }
| "RC" { RC }
| ['0'-'9']+ as lxm { INTEGER (int_of_string lxm) }
| '+' { PLUS }
| '-' { MINUS }
| '[' { LBRACKET }
| ']' { RBRACKET }
| eof { EOF }
...
A part of parser.mly
:
main:
e_expression EOF { $1 };
e_expression:
| ec = e_cell { EE_rc (Rc.Cell ec) }
| e_expression MINUS e_expression { EE_string_EEL ("MINUS", [$1; $3]) }
e_cell:
| R LBRACKET r = index RBRACKET C c = index { (Rc.I_relative r, Rc.I_absolute c) }
| R LBRACKET r = index RBRACKET C { (Rc.I_relative r, Rc.I_relative 0) }
index:
| INTEGER { $1 }
| MINUS INTEGER { Printf.printf "%n\n" 8; 0 - $2 }
This code curiously does not work with R[3]C-R[2]C
, here is parser.conflicts, that I can not really understand.
If I comment the line line | R LBRACKET r = index RBRACKET C c = index ...
in e_cell
, the code can parse R[3]C-R[2]C
, where 3
and 2
are index
, `R[3]C
and R[2]C
are e_cell
, and R[3]C-R[2]C
is e_expression
.
Could anyone help?
So the problem seems to be that when it sees a "-" token after a ], the parser is unsure whether it is making an index, or if it is separating two expressions.
i.e. When the parser reaches R[3]C-, it isn't sure whether it needs to wait for an INTEGER to complete the e_cell and reduce, or reduce now and start work on another e_expression.
The best way to solve this would probably be to move the negative integer code into the lexer. I don't have an ocamllex install handy, but I think changing
['0'-'9']+
to
'-'? ['0'-'9']+
would work, and then remove the negative integer case from index (obviously this would cause a problem with the Printf statement, but you can make the internal logic more complex to account for this.