I'm using ANTLR to define a simple programming language. At the moment it only supports method definitions at the root level. The heavily simplified grammar looks like this:
grammar MyLang;
root: declaration* ;
declaration: Def ..<snip/snap>.. End;
Def: 'def';
End: 'end';
...
My parsing code looks similar to this:
var charStream = CharStreams.fromString(s);
var lexer = new MyLangLexer(charStream);
var tokenStream = new CommonTokenStream(lexer);
var parser = new MyLangParser(tokenStream);
var rootContext = parser.root();
var astFactory = new Parser();
var declarations = astFactory.visitRoot(rootContext);
which works fine for good input like this one:
def foobar()
command1
end
def bazz()
command2
command3
end
However, if I provide bad input, e.g. omit the def
foobar()
command1
end
I silently get nothing: no exception and no result (or, if valid declarations precede the one with the missing def, the parser just silently stops after them).
How can I make ANTLR fail, e.g. with an exception, if it can't find a matching parser rule for the actual lexer tokens? Or can I find out about the last successfully parsed token (to verify whether it is EOF)?
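Your idea of checking that the last matched token is EOF is a good one, but instead of doing this after the fact, you should do it in the grammar by changing your start rule to:
root: declaration* EOF;
Without the EOF, the rule is already satisfied once no further declaration matches, so the parser simply stops there and leaves the rest of the input unread.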
Once you've done that, ANTLR will report anything it can't parse as a syntax error. By default, it prints these errors to stderr, continues parsing, and produces a syntax tree with missing pieces. You'll probably want to register your own error listener so you can handle errors yourself.
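A custom error listener could look roughly like the sketch below. It assumes the ANTLR 4 Java runtime is on the classpath; ThrowingErrorListener and its install helper are hypothetical names made up for this example, not part of ANTLR. Throwing ParseCancellationException is just one option; collecting the messages in a list would work as well.

```java
import org.antlr.v4.runtime.BaseErrorListener;
import org.antlr.v4.runtime.RecognitionException;
import org.antlr.v4.runtime.Recognizer;
import org.antlr.v4.runtime.misc.ParseCancellationException;

// Hypothetical helper (not part of ANTLR): turns every reported
// syntax error into an unchecked exception instead of a stderr message.
final class ThrowingErrorListener extends BaseErrorListener {
    static final ThrowingErrorListener INSTANCE = new ThrowingErrorListener();

    @Override
    public void syntaxError(Recognizer<?, ?> recognizer, Object offendingSymbol,
                            int line, int charPositionInLine,
                            String msg, RecognitionException e) {
        // Abort on the first error and include its position in the message.
        throw new ParseCancellationException(
                "line " + line + ":" + charPositionInLine + " " + msg);
    }

    // Removes the default console listener and registers this one.
    // Works for both lexers and parsers, since both extend Recognizer.
    static void install(Recognizer<?, ?> recognizer) {
        recognizer.removeErrorListeners();
        recognizer.addErrorListener(INSTANCE);
    }
}
```

With the code from the question, you would call ThrowingErrorListener.install(lexer) and ThrowingErrorListener.install(parser) before parser.root(). A missing def should then surface as a ParseCancellationException rather than an empty result.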