[SOLVED] Understanding JavaParser compared to JavaCC and Eclipse JDT

This is by no means an exhaustive answer, just a bit of clarification on the specific parts of your question and my 5 cents on the more general one. I assume that you want to analyze Java code.

I also assume that it is a sort of exercise in using code-as-data and grammars/parsers. Otherwise the field of code analysis itself is huge, with very specific niches like finding bugs or checking code for thread safety.

In general, there's a huge number of tools available for the purpose, but if we limit them to those written in Java, the biggest fish in the open source space seem to be covered here. For a more complete list see this blog from some of the authors of JavaParser and this for a general introduction to the topic. It may also be worth having a look at their material on the somewhat overlapping topic of language development in general.

In hindsight, these questions were lurking in the background of this response:

Do you need to parse in the first place? E.g. getting word or line counts won't need full-blown parsing. Regex or a scanner (often the first stage in parsing) might do if you want to extract all string constants or identifiers. They can't get at the nested structure of code, though.
Is full parsing needed or will a subset of the grammar do? Tools like comby handle the nested structure of code out of the box while glossing over the details.
Is it an interactive (IDE) setting that needs lots of feedback, editing support and continuous incremental compilation in the background?
Do you need to base operations on incomplete or (temporarily) broken code, e.g. for code completion? That may also be reflected in the grammar you want to use.
Do you have to deal with stuff that goes beyond parsing, e.g. type checking?
Is it only about analysis, or about transformations as well?
What's the size of the code to handle within given time constraints? More generic tools won't give you the fastest possible processing.
Do you need a compact standalone tool or can you live with a zoo of dependencies?
How well is the structure of the output suited to the intended operations on it? All Java-specific parsing tools mentioned will give you an abstract syntax tree (AST) for a given piece of code, but each AST will be different (discussed below).
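
To illustrate the first question: a minimal sketch of the regex route for pulling string constants out of a piece of source text, using only java.util.regex. The class and method names are made up for illustration, and the pattern is a deliberate simplification (no text blocks, no awareness of comments or char literals).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StringLiteralScan {

    // A double-quoted literal: opening quote, then any mix of escape
    // sequences (backslash + one char) or characters that are not a quote,
    // backslash or newline, then the closing quote.
    private static final Pattern STRING_LITERAL =
            Pattern.compile("\"(?:\\\\.|[^\"\\\\\\n])*\"");

    /** Returns every string literal found in the source text, quotes included. */
    public static List<String> stringLiterals(String source) {
        List<String> found = new ArrayList<>();
        Matcher m = STRING_LITERAL.matcher(source);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String code = "String a = \"hi\"; String b = \"x\\\"y\";";
        stringLiterals(code).forEach(System.out::println);
    }
}
```

A quote inside a comment or inside a char literal like '"' will throw this off, which is about the point where a real scanner or parser starts to pay off.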

Let's go from the specific to the general:

com.github.javaparser parses a static piece of Java code (note: only Java, only static) and gives you an AST. The package also has SymbolResolver, which tries to determine the Java type of symbols. It's called JavaParser, but it isn't just a parser: it supports Java streams for querying and comes with AST manipulation and code generation capabilities. A main backer is an Italian company btw.

Eclipse JDT is comparatively huge, with org.eclipse.jdt.core.dom.ASTParser giving you an AST. But as opposed to JavaParser, everything is geared towards handling Java (only) in an interactive development situation. Since Eclipse can perform refactorings, it must be able to analyze and manipulate the AST; here's an example for that (as part of this post) and here are comprehensive examples for the refactoring API. If you're building some Eclipse-integrated functionality to support writing code, that will be your first option anyway. Eclipse JDT supports incremental compilation in some form, which you need if you want some compile-on-the-fly-and-give-feedback-as-the-code-gets-typed functionality.

I also worked a bit with the Spoon library (developed by a university in France), which has the same focus as JavaParser and also does symbol resolution, but has different querying mechanisms. It builds on org.eclipse.jdt.core. Each of those tools will give you a different AST for the same Java code, reflecting its intended use case. Spoon describes it like this:

A programming language can have different meta models. An abstract syntax tree (AST) or model, is an instance of a meta model. Each meta model – and consequently each AST – is more or less appropriate depending on the task at hand. For instance, the Java meta model of Sun’s compiler (javac) has been designed and optimized for compilation to bytecode, while, the main purpose of the Java meta model of the Eclipse IDE (JDT) is to support different tasks of software development in an integrated manner (code completion, quick fix of compilation errors, debug, etc.).
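
To make the javac meta model mentioned in the quote a bit more tangible: it can be reached from plain Java through the JDK's Compiler Tree API (com.sun.source.*) without any extra dependency. This is only a rough sketch, assuming you run on a full JDK (ToolProvider.getSystemJavaCompiler() returns null on a bare JRE). The class JavacAstSketch and the helper StringSource are my own names for illustration, not part of any library.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.MethodTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;

import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

class JavacAstSketch {

    /** Hypothetical helper: an in-memory source file, so nothing touches the disk. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Parses the code (no type checking) and collects the names of declared methods. */
    public static List<String> methodNames(String className, String code) throws IOException {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system compiler found, a JDK is required");
        }
        JavacTask task = (JavacTask) compiler.getTask(
                null, null, null, null, null, List.of(new StringSource(className, code)));

        List<String> names = new ArrayList<>();
        for (CompilationUnitTree unit : task.parse()) {
            // TreeScanner walks the whole tree; we only hook into method declarations.
            new TreeScanner<Void, Void>() {
                @Override
                public Void visitMethod(MethodTree method, Void unused) {
                    names.add(method.getName().toString());
                    return super.visitMethod(method, unused);
                }
            }.scan(unit, null);
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        String code = "class A { void foo() {} int bar(int x) { return x; } }";
        System.out.println(methodNames("A", code));
    }
}
```

Only the parse step is run here, so there is no symbol resolution or type information involved. That is the part the more specialized libraries add on top in their own way.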

The starkest difference is between the more domain-specific tools and the parsers produced by parser generators. While there are some differences even between them, JavaParser/Spoon ASTs mirror the code on a conceptual level: you get methods, parameter lists, parameters and so on. The generated parsers, on the other hand, give you every detail in the grammar, down to semicolons, commas and braces, as elements in the AST. I think Eclipse has an AST View where you can see JDT's parser output, but I'm not aware of a comprehensive tool that can show you the differences between different parsers for Java like AST Explorer does in the JavaScript world.

Which framework suits your needs will depend very much on your use case. E.g. if you need symbol resolution, you're probably bound to those options that provide it anyway. I tried to get my feet wet with a Java transpiler and found the JavaParser metamodel more suitable than Spoon's model and liked its small number of dependencies.

A general (though non-incremental) way to get hold of an AST would be a parser generator like JavaCC (read: compiler compiler (aka compiler generator) written in Java that can create parsers for anything you have a grammar for) or ANTLR. If you want to parse SQL, you feed them an SQL grammar; if you want to parse Java code, you feed them this one (ANTLR-format) or this one (JavaCC-format). The result will be a parser which can give you an AST for a given piece of code, and perhaps a visitor class.

This approach gives you all possible control over the processing and the possibility to define or tweak a grammar depending on your needs, e.g. to introduce additional non-terminal nodes, trim it down to class/method-level only or pick out comments only without confusing them with string constants, if that's all you care about. You could also get at the structure of embedded non-Java code fragments, e.g. SQL query strings.
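
As a rough illustration of the comments-versus-string-constants point: below is a small hand-written scanner (so a swapped-in technique, not the output of JavaCC or ANTLR) that collects comments while skipping over string and char literals. The class and method names are hypothetical, and it ignores text blocks and unicode escapes.

```java
import java.util.ArrayList;
import java.util.List;

class CommentScanner {

    /** Returns all line and block comments, in order of appearance. */
    public static List<String> comments(String src) {
        List<String> out = new ArrayList<>();
        int i = 0;
        int n = src.length();
        while (i < n) {
            char c = src.charAt(i);
            if (c == '"' || c == '\'') {
                // Inside a literal nothing counts as a comment, so jump past it.
                i = skipLiteral(src, i);
            } else if (src.startsWith("//", i)) {
                int end = src.indexOf('\n', i);
                if (end < 0) end = n;
                out.add(src.substring(i, end));
                i = end;
            } else if (src.startsWith("/*", i)) {
                int end = src.indexOf("*/", i + 2);
                end = (end < 0) ? n : end + 2;
                out.add(src.substring(i, end));
                i = end;
            } else {
                i++;
            }
        }
        return out;
    }

    // Returns the index just after the closing quote of the literal starting at 'start'.
    private static int skipLiteral(String src, int start) {
        char quote = src.charAt(start);
        int i = start + 1;
        while (i < src.length() && src.charAt(i) != quote) {
            if (src.charAt(i) == '\\') i++; // skip the escaped character as well
            i++;
        }
        return Math.min(i + 1, src.length());
    }

    public static void main(String[] args) {
        String code = "String s = \"// not a comment\"; // real\n/* block */ char c = '\"';";
        comments(code).forEach(System.out::println);
    }
}
```

With a grammar-based tool you would express the same idea as lexer rules for literals and comments and simply drop everything else.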

Btw, ANTLR (since version 4) can handle direct left recursion in the grammar, while JavaCC can't. This matters e.g. for rules describing binary operators in arithmetic expressions, such as exp := exp + exp.
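
To see why left recursion is a problem for the kind of top-down parser JavaCC generates: a method exp() that starts by calling exp() never consumes any input and recurses forever. The usual workaround is to rewrite the rule as a loop. Here is a small hand-written sketch of that rewrite (not generated code, names are mine), using the slightly simpler rule exp := exp ('+' | '-') term | term over integers.

```java
class LeftRecursionSketch {
    private final String input;
    private int pos = 0;

    LeftRecursionSketch(String input) {
        this.input = input.replace(" ", ""); // keep the sketch free of whitespace handling
    }

    // Left-recursive rule:  exp := exp ('+' | '-') term | term
    // Rewritten as a loop:  exp := term (('+' | '-') term)*
    // Folding the result from the left keeps '-' left-associative.
    int exp() {
        int value = term();
        while (pos < input.length() && (peek() == '+' || peek() == '-')) {
            char op = input.charAt(pos++);
            int right = term();
            value = (op == '+') ? value + right : value - right;
        }
        return value;
    }

    // term := DIGIT+
    int term() {
        int start = pos;
        while (pos < input.length() && Character.isDigit(input.charAt(pos))) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Number expected at position " + pos);
        }
        return Integer.parseInt(input.substring(start, pos));
    }

    private char peek() {
        return input.charAt(pos);
    }

    /** Parses and evaluates the whole text, rejecting trailing garbage. */
    public static int evaluate(String text) {
        LeftRecursionSketch parser = new LeftRecursionSketch(text);
        int result = parser.exp();
        if (parser.pos != parser.input.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + parser.pos);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(evaluate("10 - 3 - 2"));
    }
}
```

With ANTLR you can write the left-recursive form directly and let the tool do this kind of rewriting for you. With JavaCC you would have to write the looping form in the grammar yourself.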

If your goal is to support developer activities as they write the code, you'll have to deal with broken or incomplete code. Eclipse is built for the purpose, and while I didn't use JDT myself, I'd expect it to handle such cases gracefully with reasonable feedback. ANTLR will also recover from syntax errors where possible and lets you define some error handling. I don't remember what Spoon and JavaParser did in case of errors; I think they expect syntactically correct code upfront.