As a pet project, I'm trying to write a groff parser with Jison (a JavaScript clone of Bison), but I'm struggling to figure out whether groff's grammar is LALR(1).
Does anyone have any insight into this?
Thanks in advance.
Update 1
In response to Brian's concerns, here are more details about my problem:
Groff is written in C++ and does not use Bison; I'm deriving the grammar myself.
I've uploaded all my progress here
Most of the work parsing troff is lexical, although you could make use of a parser to evaluate arithmetic expressions. The "grammar" is otherwise just a question of identifying control lines and splitting them into arguments (again, essentially lexical).
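To illustrate how little grammar is involved, here is a minimal sketch of that lexical step in plain JavaScript. The names `parseLine` and `splitArgs` are hypothetical, the control characters are fixed at their defaults, and the argument splitting follows macro-style quoting only (a doubled `""` inside a quoted argument stands for one quote). Real requests such as `.ds` read their arguments differently, and comments and escapes are not handled here.

```javascript
// Hypothetical helper: split the argument part of a control line.
// Arguments are separated by spaces; a double-quoted argument may
// contain spaces, and "" inside it stands for a literal quote.
function splitArgs(s) {
  const args = [];
  let i = 0;
  while (i < s.length) {
    while (i < s.length && s[i] === " ") i++; // skip separators
    if (i >= s.length) break;
    let arg = "";
    if (s[i] === '"') {
      i++; // opening quote
      while (i < s.length) {
        if (s[i] === '"') {
          if (s[i + 1] === '"') { arg += '"'; i += 2; continue; }
          i++; // closing quote
          break;
        }
        arg += s[i++];
      }
    } else {
      while (i < s.length && s[i] !== " ") arg += s[i++];
    }
    args.push(arg);
  }
  return args;
}

// Hypothetical helper: classify a single line (no embedded newline)
// as a text line or a control line, using the default control
// characters "." and "'".
function parseLine(line) {
  const lead = line[0];
  if (lead !== "." && lead !== "'") return { type: "text", text: line };
  // Spaces or tabs may sit between the control character and the name.
  const m = /^[ \t]*(\S*)[ \t]*(.*)$/.exec(line.slice(1));
  return {
    type: "control",
    noBreak: lead === "'",
    name: m[1],
    args: splitArgs(m[2]),
  };
}
```

For example, `parseLine('.TH GROFF 1 "7 March 2013"')` yields a control line named `TH` with the three arguments `GROFF`, `1` and `7 March 2013`.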
If you intend to implement the controls which modify control and escape characters (.cc, .c2, .ec and .eo), then you will find precompiled regular expressions to be awkward, although the workaround for control characters is not awful.
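One possible workaround for the control characters, sketched below, is to rebuild the line-matching pattern whenever `.cc` or `.c2` is seen instead of precompiling it once. The class `ControlLineMatcher` and the helper `escapeRegExp` are hypothetical names, and the sketch assumes the usual troff behaviour that `.cc` and `.c2` without an argument restore `.` and `'` respectively. It does not attempt `.ec` or `.eo`, which affect escape processing inside every line and are the harder case.

```javascript
// Escape a character so it can be embedded literally in a RegExp source.
function escapeRegExp(ch) {
  return ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical matcher whose pattern follows the current control characters.
class ControlLineMatcher {
  constructor() {
    this.control = ".";
    this.noBreak = "'";
    this.rebuild();
  }

  // Recompile the pattern from the current control characters.
  rebuild() {
    const c = escapeRegExp(this.control);
    const n = escapeRegExp(this.noBreak);
    this.pattern = new RegExp(`^(${c}|${n})[ \\t]*(\\S*)[ \\t]*(.*)$`);
  }

  // Classify one line and apply .cc / .c2 as a side effect.
  feed(line) {
    const m = this.pattern.exec(line);
    if (!m) return { type: "text", text: line };
    const [, lead, name, rest] = m;
    const noBreak = lead === this.noBreak; // decide before any update
    if (name === "cc") {
      this.control = rest[0] || "."; // no argument: restore the default
      this.rebuild();
    } else if (name === "c2") {
      this.noBreak = rest[0] || "'"; // no argument: restore the default
      this.rebuild();
    }
    return { type: "control", name, rest, noBreak };
  }
}
```

After feeding `.cc #`, the line `.br` is treated as text and `#br` as a control line, until a bare `#cc` restores the default.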
I think I'd be inclined to restrict use of jison to pieces of the language like arithmetic expressions.
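As a rough idea of what that piece has to do, the sketch below evaluates a small subset of troff numeric expressions by hand. It is written in plain JavaScript rather than as a Jison grammar so that it runs on its own; the function name `evalTroffExpr` is hypothetical. It relies on one documented troff rule, that binary operators are applied strictly left to right with no precedence (parentheses are the only way to group), and it assumes integers only, with no scale indicators, comparison operators, registers or embedded spaces.

```javascript
// Hypothetical evaluator for a subset of troff numeric expressions:
// integers, + - * / %, parentheses, strict left-to-right evaluation.
function evalTroffExpr(src) {
  let pos = 0;

  // operand := [+|-] ( "(" expr ")" | digits )
  function operand() {
    let sign = 1;
    if (src[pos] === "-") { sign = -1; pos++; }
    else if (src[pos] === "+") { pos++; }
    if (src[pos] === "(") {
      pos++;
      const inner = expr();
      if (src[pos] !== ")") throw new SyntaxError(`")" expected at ${pos}`);
      pos++;
      return sign * inner;
    }
    const m = /^\d+/.exec(src.slice(pos));
    if (!m) throw new SyntaxError(`number expected at ${pos}`);
    pos += m[0].length;
    return sign * parseInt(m[0], 10);
  }

  // expr := operand (op operand)*, folded left to right, no precedence.
  function expr() {
    let value = operand();
    while (pos < src.length && "+-*/%".includes(src[pos])) {
      const op = src[pos++];
      const rhs = operand();
      if (op === "+") value += rhs;
      else if (op === "-") value -= rhs;
      else if (op === "*") value *= rhs;
      else if (op === "/") value = Math.trunc(value / rhs); // integer division
      else value %= rhs;
    }
    return value;
  }

  const result = expr();
  if (pos !== src.length) throw new SyntaxError(`unexpected input at ${pos}`);
  return result;
}
```

With these rules `evalTroffExpr("1+2*3")` gives 9, not 7, while `evalTroffExpr("1+(2*3)")` gives 7. A grammar with a single left-associative operator level like this one poses no difficulty for an LALR(1) generator.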
Of course, jison would come in handy for preprocessors like eqn, in case that is in your plans.