Can a parser generated by tree-sitter be used for both syntax highlighting and the compiler itself? If not, why?
It would be counterproductive to write and maintain two different parsers.
Note: I haven't used tree-sitter yet, but I am considering it for highlighting the syntax of my own programming language. Because of that, I may misunderstand how its parser actually works.
Quoting the answer from https://github.com/tree-sitter/tree-sitter/discussions/831:
I think the biggest downside to using a Tree-sitter parser in a compiler front-end is that, while we've done a lot of work on Tree-sitter's error recovery, we haven't yet built out functionality for error messages. So it isn't trivial to find out the exact token/position where the error initiated, and get a list of expected tokens, and things like that.
Also, the error recovery currently isn't customizable in domain-specific ways (e.g. as soon as the word "function" appears, assume that the user meant to write an entire function definition).
Down the road, I would love to invest in both of these things, but because there's so much other stuff we're working on, it may be a while before this happens.
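To make the point about error messages concrete, here is a minimal Rust sketch of what a compiler front-end can recover from an error-recovered tree. It assumes error regions appear as nodes of kind "ERROR", which is how Tree-sitter marks them. The Node struct, error_positions, and sample_tree are hypothetical stand-ins for illustration, not the tree-sitter crate's API.

```rust
// Stand-in for a parse-tree node; NOT the tree-sitter API.
// Assumption: error regions show up as nodes whose kind is "ERROR".
#[derive(Debug)]
struct Node {
    kind: &'static str,
    row: usize,
    column: usize,
    children: Vec<Node>,
}

/// Collect the (row, column) start of every ERROR node, depth-first.
fn error_positions(node: &Node, out: &mut Vec<(usize, usize)>) {
    if node.kind == "ERROR" {
        out.push((node.row, node.column));
    }
    for child in &node.children {
        error_positions(child, out);
    }
}

/// Hypothetical tree for the broken input `let = 1;`.
fn sample_tree() -> Node {
    Node {
        kind: "source_file",
        row: 0,
        column: 0,
        children: vec![
            Node { kind: "let", row: 0, column: 0, children: vec![] },
            Node {
                kind: "ERROR",
                row: 0,
                column: 4,
                children: vec![Node { kind: "=", row: 0, column: 4, children: vec![] }],
            },
            Node { kind: "number", row: 0, column: 6, children: vec![] },
        ],
    }
}
```

A walk like this only tells you where the recovered error region starts. It does not say which tokens the parser expected at that point, which is the missing piece the quote describes.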
I managed to use a tree-sitter parser for a toy language to implement an interpreter in Rust: https://github.com/sgraf812/tree-sitter-lambda/blob/35fe05520e806548dedb48e7f97118847b531b26/src/main.rs
Having done that, I can't recommend it:
- [...] bison and C). This means you have to switch over Node::kind, a string. Inefficient and incomplete matches everywhere.
- [...] utf8_text.
I have a feeling that tree-sitter is best in class only when you don't need a typed overlay of the syntax tree.
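As a rough illustration of the typed-overlay problem, the sketch below lowers an untyped, string-tagged tree into a typed AST. The Node struct is a hypothetical stand-in, not the tree-sitter crate's Node type. The kind names "var", "lam", and "app" and the Expr enum are also assumptions for an imagined lambda-calculus grammar.

```rust
// Stand-in for an untyped parse-tree node; NOT the tree-sitter API.
struct Node {
    kind: &'static str,
    text: &'static str,
    children: Vec<Node>,
}

// Typed overlay for a hypothetical lambda-calculus grammar.
#[derive(Debug, PartialEq)]
enum Expr {
    Var(String),
    Lam(String, Box<Expr>),
    App(Box<Expr>, Box<Expr>),
}

/// Convenience constructor for a childless node.
fn leaf(kind: &'static str, text: &'static str) -> Node {
    Node { kind, text, children: vec![] }
}

/// Lower the untyped tree into `Expr` by matching on the kind string.
fn lower(node: &Node) -> Result<Expr, String> {
    match (node.kind, node.children.as_slice()) {
        ("var", []) => Ok(Expr::Var(node.text.to_string())),
        ("lam", [param, body]) => {
            Ok(Expr::Lam(param.text.to_string(), Box::new(lower(body)?)))
        }
        ("app", [func, arg]) => {
            Ok(Expr::App(Box::new(lower(func)?), Box::new(lower(arg)?)))
        }
        // The compiler cannot check string matches for exhaustiveness,
        // so a catch-all arm is required and grammar changes fail at runtime.
        (other, kids) => Err(format!(
            "unexpected node `{}` with {} children",
            other,
            kids.len()
        )),
    }
}
```

Calling lower on a "lam" node with two "var" children yields an Expr::Lam, while any kind the match does not list falls through to the Err arm. Every consumer of the tree needs a conversion layer like this, and nothing keeps it in sync with the grammar.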
See also https://github.com/tree-sitter/tree-sitter/discussions/831#discussioncomment-5797368 for another experience report.