I'm using Antlr4 to create an interpreter, lexer and parser. The GUI it will be used in contains QScintilla2.
As QScintilla does not need a parser and has a CustomLexer module, will the interpreter (built with Antlr4, Python3 target) be enough?
I'm not asking for opinions but factual guidance. Thanks.
What does an Interpreter contain
An interpreter must have some way to parse the code and then some way to run it. Usually the "way to parse the code" would be handled by a lexer+parser, but lexerless parsing is also possible. Either way, the parser will create some intermediate representation of the code such as a tree or bytecode. The "way to run it" will then be a phase that iterates over the generated tree or bytecode and executes it. JIT-compilation (i.e. generating machine code from the tree or bytecode and then executing that) is also possible, but more advanced. You can also run various analyses between parsing and execution (for example, you can check whether any undefined variables are used anywhere, or you could do static type checking - though the latter is uncommon in interpreted languages).
When using ANTLR, ANTLR will generate a lexer and parser for you, the latter of which will produce a parse tree as a result, which you can iterate over using the generated listener or visitor. At that point you proceed as you see fit with your own code. For example, you could generate bytecode from the parse tree and execute that, translate the parse tree to a simplified tree and execute that or execute the parse tree directly in a visitor.
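To make the "execute the tree directly in a visitor" option concrete, here is a minimal sketch. It does not use ANTLR at all: the node classes (Num, Var, BinOp, Assign) and the Evaluator class are hypothetical stand-ins for the parse tree and the generated visitor, so treat it as an illustration of the idea rather than of ANTLR's actual API.

```python
from dataclasses import dataclass


# Hypothetical node types standing in for the nodes of a parse tree.
@dataclass
class Num:
    value: float


@dataclass
class Var:
    name: str


@dataclass
class BinOp:
    op: str
    left: object
    right: object


@dataclass
class Assign:
    name: str
    expr: object


class Evaluator:
    """Walks the tree and executes it directly, visitor-style."""

    OPS = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "/": lambda a, b: a / b,
    }

    def __init__(self):
        self.env = {}  # variable name -> current value

    def visit(self, node):
        # Dispatch on the node's class name, e.g. BinOp -> visit_BinOp.
        return getattr(self, "visit_" + type(node).__name__)(node)

    def visit_Num(self, node):
        return node.value

    def visit_Var(self, node):
        if node.name not in self.env:
            raise NameError(f"undefined variable: {node.name}")
        return self.env[node.name]

    def visit_BinOp(self, node):
        return self.OPS[node.op](self.visit(node.left), self.visit(node.right))

    def visit_Assign(self, node):
        value = self.visit(node.expr)
        self.env[node.name] = value
        return value


# Usage: x = 2 + 3, then evaluate x * 4.
ev = Evaluator()
ev.visit(Assign("x", BinOp("+", Num(2), Num(3))))
result = ev.visit(BinOp("*", Var("x"), Num(4)))  # 20
```

With ANTLR you would typically subclass the generated visitor instead and write one visit method per grammar rule, but the shape of the code stays roughly the same.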
QScintilla is about displaying the language and is not linked to the interpreter. In an IDE the console is where the interpreter comes into play along with running the script (from a 'Run' button for example). The only thing which is common to QScintilla and the interpreter is the script file - the interpreter is not connected or linked to QScintilla. Does this make basic sense?
Yes, that makes sense, but it doesn't have to be entirely like that. That is, it can make sense to reuse certain parts of your interpreter to implement certain features in your editor/IDE, but you don't have to.
You've specifically mentioned the "Run" button and as far as that is concerned, the implementation of the interpreter (and whether or not it uses ANTLR) is of absolutely no concern. In fact it doesn't even matter which language the interpreter is written in. If your interpreter is named mylangi and you're currently editing a file named foo.mylang, then hitting the "Run" button should simply execute subprocess.run(["mylangi", "foo.mylang"]) and display the result in some kind of tab or window.
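A slightly fuller sketch of what the "Run" handler might call is below. The function name run_script and the timeout value are my own assumptions; the only real API used is subprocess.run from the standard library.

```python
import subprocess


def run_script(interpreter_cmd, script_path, timeout=10):
    """Run a script with an external interpreter and capture what it prints.

    interpreter_cmd is a list such as ["mylangi"]; the script path is
    appended as the last argument.
    """
    completed = subprocess.run(
        [*interpreter_cmd, script_path],
        capture_output=True,  # collect stdout and stderr instead of inheriting them
        text=True,            # decode the output to str
        timeout=timeout,      # avoid hanging the GUI forever on a runaway script
    )
    return completed.returncode, completed.stdout, completed.stderr
```

The GUI would then call something like run_script(["mylangi"], "foo.mylang") and put the returned stdout and stderr into its output tab. In a real GUI you would probably want to run this off the main thread so the window stays responsive.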
Same if you want to have a "console" or "REPL" window where you can interact with the interpreter: You simply invoke the interpreter as a subprocess and connect it to the tab or subwindow that displays the console. Again the implementation of the interpreter is irrelevant for this - you treat it like any other command line application.
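As a rough illustration of the console case, the sketch below keeps one child process alive and exchanges lines with it through pipes. InterpreterSession and FAKE_REPL are hypothetical names; FAKE_REPL is just a stand-in child process that upper-cases each input line, since the real interpreter is not available here. The sketch also assumes exactly one line of output per line of input, which a real REPL will not guarantee.

```python
import subprocess
import sys


class InterpreterSession:
    """Keeps one interpreter process alive and exchanges lines with it."""

    def __init__(self, command):
        self.proc = subprocess.Popen(
            command,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line-buffered on the parent's side
        )

    def send(self, line):
        """Send one line of input and block until one line of output arrives."""
        self.proc.stdin.write(line + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().rstrip("\n")

    def close(self):
        """Signal end of input, wait for the child to exit, return its exit code."""
        self.proc.stdin.close()
        code = self.proc.wait()
        self.proc.stdout.close()
        return code


# Stand-in for a real REPL: echoes every input line in upper case.
FAKE_REPL = [
    sys.executable, "-u", "-c",
    "import sys\n"
    "for line in iter(sys.stdin.readline, ''):\n"
    "    print(line.strip().upper(), flush=True)\n",
]

session = InterpreterSession(FAKE_REPL)
reply = session.send("hello")  # "HELLO"
session.close()
```

Inside a Qt application, QProcess with its ready-to-read signals is likely a better fit than blocking reads, because the event loop keeps running while the interpreter is busy.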
Now other features that IDEs and code editors have are syntax highlighting, auto-completion and error highlighting.
For syntax highlighting you need some code that goes through the source and tells the editor which parts of the code should have which color (or boldness etc.). Using QScintilla, you accomplish this by providing a lexer class that does this. You can define such a class by simply writing the necessary code to detect the types of tokens by hand, but you can also re-use the lexer generated by ANTLR. So that's one way in which the implementation of your interpreter could be re-used in the editor/IDE. However, since a syntax highlighter is usually fairly straightforward to write by hand, you don't have to do it this way.
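To show how little a hand-written highlighter needs, here is a sketch of the token detection part only, kept free of any Qt dependency. The style numbers, the keyword set and the toy syntax (# comments, double-quoted strings) are all assumptions for illustration. In a custom QScintilla lexer, the resulting spans would be fed to the editor's styling calls one after another.

```python
import re

# Hypothetical style numbers; a QScintilla lexer would map these to colors and fonts.
STYLE_DEFAULT, STYLE_KEYWORD, STYLE_NUMBER, STYLE_STRING, STYLE_COMMENT = range(5)

# Assumed keywords of the toy language.
KEYWORDS = {"if", "else", "while", "print"}

# One alternative per token kind; "other" catches whitespace and any stray character,
# so consecutive matches cover the whole text without gaps.
TOKEN_RE = re.compile(r"""
    (?P<comment>\#[^\n]*)
  | (?P<string>"[^"\n]*")
  | (?P<number>\d+(?:\.\d+)?)
  | (?P<word>[A-Za-z_]\w*)
  | (?P<other>\s+|.)
""", re.VERBOSE)


def style_spans(text):
    """Return a list of (start, length, style) tuples covering the whole text."""
    spans = []
    for match in TOKEN_RE.finditer(text):
        kind = match.lastgroup
        if kind == "comment":
            style = STYLE_COMMENT
        elif kind == "string":
            style = STYLE_STRING
        elif kind == "number":
            style = STYLE_NUMBER
        elif kind == "word" and match.group() in KEYWORDS:
            style = STYLE_KEYWORD
        else:
            style = STYLE_DEFAULT
        spans.append((match.start(), len(match.group()), style))
    return spans
```

One caveat: the lengths here are counted in characters, whereas Scintilla works with byte positions as far as I know, so text containing non-ASCII characters would need its lengths computed on the UTF-8 encoded form.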
For code completion you need to understand which variables and functions are defined in the file, what their scope is, and which other files are included in the current file. These days it's becoming common to implement this logic in a so-called language server, which is a separate tool that can be re-used from different editors and IDEs. Regardless of whether you implement this logic in such a language server or directly in your editor, you'll need a parser (and, if applicable, a type checker) to be able to answer these types of questions. Again that's something that you can re-use from your interpreter, and this time that's definitely a good idea because writing a second parser would be significant additional work (and easy to get out of sync with the interpreter's parser).
For error highlighting you can simply invoke the interpreter in "verify only" mode (i.e. only print out syntax errors and other errors that can be detected statically, but don't actually run the file -- many interpreters have such an option) and then parse the output to find out where to draw the squiggly lines. But you can also re-use the parser (and analyses if you have any) from your interpreter instead. If you go the route of having a language server, errors and warnings would also be handled by the language server.
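The "parse the output" step could look roughly like the sketch below. The output format it expects, one diagnostic per line in the form file:line:column: message, is purely an assumption; your interpreter's "verify only" mode may print something different, in which case the regular expression needs adjusting. Diagnostic and parse_diagnostics are hypothetical names.

```python
import re
from typing import NamedTuple


class Diagnostic(NamedTuple):
    line: int     # line number as reported by the interpreter
    column: int   # column number as reported by the interpreter
    message: str


# Assumed format: "<file>:<line>:<column>: <message>"
DIAG_RE = re.compile(r"^(?P<file>.+?):(?P<line>\d+):(?P<col>\d+):\s*(?P<msg>.*)$")


def parse_diagnostics(output):
    """Extract diagnostics from interpreter output, ignoring unrelated lines."""
    diagnostics = []
    for raw in output.splitlines():
        match = DIAG_RE.match(raw)
        if match:
            diagnostics.append(
                Diagnostic(int(match["line"]), int(match["col"]), match["msg"])
            )
    return diagnostics
```

The editor would then turn each entry into a marked range at the given line and column. Whether the reported numbers are 0-based or 1-based depends on the interpreter, so that conversion is left out here.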