regex visual-studio-code vscode-extensions tmlanguage

How to use "!" as both the comment indicator and the NOT operator in language syntax highlighting?


I am using VS Code and creating my own language extension for syntax highlighting, and I need a regular expression to find the comments.

The basic rule is that everything after ! is a comment. However, there is a special case: when ! is inside an eval() command, it means NOT.

For example some of my code would look like:

if condition=(eval(!DB_EXIST)) ! this is a comment
(eval( !DB_UPDATED && !DB_EXIST)) !---"!" inside eval() means NOT
!this is another comment
<some commands> ! this is also a comment

The !DB_EXIST in lines 1 and 2 should not be interpreted as a comment; there, ! is always followed by a non-space character.

Whitespace doesn't matter in comments.

"comments": {
    "patterns": [{
        "match":"regex1",
        "name":"comment"
    }]
},
"operator": {
    "patterns": [{
        "match":"regex2",
        "name":"keyword.operator.NOT"
    }]
},

What regexes should I use for regex1 and regex2 so that comments and NOT are shown in different colors?

I am not experienced at writing extensions, so if there is a better way to do this, I would appreciate hearing about it. Thanks!

Update

@Gama11 helped me, but my code samples didn't cover all the cases. Any non-space character after "!" should also start a comment, as long as the "!" is not inside eval().


Solution

  • Here's one way to do it:

    {
        "$schema": "https://raw.githubusercontent.com/Septh/tmlanguage/master/tmLanguage.schema.json",
        "scopeName": "source.abc",
        "patterns": [
            {
                "begin": "(eval)\\(",
                "end": "\\)",
                "captures": {
                    "1": {
                        "name": "entity.name.function"
                    }
                },
                "patterns": [
                    {
                        "include": "#parens"
                    },
                    {
                        "match": "!",
                        "name": "keyword"
                    }
                ]
            },
            {
                "match": "!.*?$",
                "name": "comment"
            }
        ],
        "repository": {
            "parens": {
                "begin": "\\(",
                "end": "\\)",
                "patterns": [
                    {
                        "include": "#parens"
                    }
                ]
            }
        }
    }
    

    We put the pattern for the non-comment ! first, since it's more specific and should have priority over the other one. Also, I used the "keyword" scope instead of the more appropriate "keyword.operator.NOT" so that it visibly gets a different color.
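
To see why the comment rule can't be the only rule, here is a small sketch that applies the same regex on its own to two of the sample lines from the question. It uses Python's `re` module as a stand-in for the Oniguruma engine that VS Code uses (an assumption that should be safe for a pattern this simple), and `naive_comment` is a hypothetical helper name.

```python
import re

# Same regex as the grammar's comment rule. Python's `re` is only a stand-in
# for the regex engine VS Code uses; it is not the TextMate tokenizer.
COMMENT = re.compile(r"!.*?$")


def naive_comment(line):
    """Return the text the comment rule would claim if it were the only rule."""
    match = COMMENT.search(line)
    return match.group(0) if match else None


# A line without eval(): the rule picks up exactly the trailing comment.
print(naive_comment("<some commands> ! this is also a comment"))

# A line with eval(): the rule starts at the NOT operator and swallows the
# rest of the line, which is what the eval() begin/end rule prevents.
print(naive_comment("if condition=(eval(!DB_EXIST)) ! this is a comment"))
```

On the second line the match starts at the ! of !DB_EXIST, so without the eval() rule the NOT operator and everything after it would be colored as a comment.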

    The first regex is a begin-end pattern, which allows us to apply patterns only to the text between those two matches (in this case, within an eval() call). While we're at it, we might as well highlight eval as a function with the entity.name.function scope.

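    If we're within an eval(), we allow two patterns: the recursive #parens rule, which keeps nested parentheses balanced so that an inner ) doesn't end the eval() early, and a plain ! match, which is scoped as the NOT operator.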

    The second top-level regex (the comment rule) simply matches !, and then anything (.*?) until the end of the line ($).
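
As a rough way to check the expected result without launching VS Code, the two rules can be imitated with a hand-written scanner. This is only a sketch: `classify_bangs` is a hypothetical helper, it works on one line at a time (a real begin-end rule can span lines), and it imitates the grammar's behavior rather than running the TextMate engine.

```python
def classify_bangs(line):
    """Label each '!' in a line as 'not' or 'comment'.

    Hypothetical helper that imitates the grammar's two rules with a
    hand-written scanner; it is not the TextMate tokenizer.
    """
    labels = []
    i = 0
    depth = 0  # 0 = outside eval(), 1 = directly inside, >1 = nested parens
    while i < len(line):
        if depth == 0 and line.startswith("eval(", i):
            depth = 1  # corresponds to "begin": "(eval)\\("
            i += len("eval(")
            continue
        ch = line[i]
        if depth > 0:
            if ch == "(":
                depth += 1  # nested group, like the #parens rule
            elif ch == ")":
                depth -= 1  # closes a nested group or the eval() itself
            elif ch == "!" and depth == 1:
                # Like the grammar, only a '!' directly inside eval() is
                # labeled; inside nested parens it gets no label at all.
                labels.append((i, "not"))
        elif ch == "!":
            labels.append((i, "comment"))
            break  # the comment rule consumes the rest of the line
        i += 1
    return labels


samples = [
    "if condition=(eval(!DB_EXIST)) ! this is a comment",
    '(eval( !DB_UPDATED && !DB_EXIST)) !---"!" inside eval() means NOT',
    "!this is another comment",
    "<some commands> ! this is also a comment",
]
for sample in samples:
    print(classify_bangs(sample))
```

Each tuple holds the column of a ! and the role it gets. Comparing this output with what the scope inspector shows in VS Code is one way to spot a line where the grammar behaves differently from what you intended.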

    In general, I'd highly recommend using a tool like regex101.com for playing around with the regexes of TM Grammar files. Much easier than iterating in VSCode itself since you get instant feedback.