
How to embed a syntax object in another in TextMate language definitions, tmLanguage


I am trying to support Clojure's ignore-next-form macro, #_ (a sort of comment), in VS Code, which uses tmLanguage for its grammar definitions. Since it is common to disable a block of code using #_, I want the disabled block to keep its syntax highlighting and merely be italicized to indicate its status.

But my lack of tmLanguage skills is getting in the way. This is one of my failing attempts (a snippet of the CSON):

  'comment-constants':
    'begin': '#_\\s*(?=\'?#?[^\\s\',"\\(\\)\\[\\]\\{\\}]+)'
    'beginCaptures':
      '0':
        'name': 'punctuation.definition.comment.begin.clojure'
    'end': '(?=[\\s\',"\\(\\)\\[\\]\\{\\}])'
    'name': 'meta.comment-expression.clojure'
    'patterns':
      [
        {
          'include': '#constants'
        }
      ]

Here constants is a rule that pulls in the definitions of some Clojure constants, such as keyword:

  'keyword':
    'match': '(?<=(\\s|\\(|\\[|\\{)):[\\w\\#\\.\\-\\_\\:\\+\\=\\>\\<\\/\\!\\?\\*]+(?=(\\s|\\)|\\]|\\}|\\,))'
    'name': 'constant.keyword.clojure'

What I want is for the constants definitions to be applied "inside" the comment. For keywords I have this (failing) spec:

  it "tokenizes keywords", ->
    tests =
      "meta.expression.clojure": ["(:foo)"]
      "meta.map.clojure": ["{:foo}"]
      "meta.vector.clojure": ["[:foo]"]
      "meta.quoted-expression.clojure": ["'(:foo)", "`(:foo)"]
      "meta.comment-expression.clojure": ["#_:foo"]

    for metaScope, lines of tests
      for line in lines
        {tokens} = grammar.tokenizeLine line
        expect(tokens[1]).toEqual value: ":foo", scopes: ["source.clojure", metaScope, "constant.keyword.clojure"]

The last test in that list fails with this message:

Expected
{  value : ':foo',
  scopes : [ 'source.clojure', 'meta.comment-expression.clojure' ] }
to equal
{  value : ':foo',
  scopes : [ 'source.clojure', 'meta.comment-expression.clojure', 'constant.keyword.clojure' ] }.

This means the constant.keyword.clojure scope is not being applied, and thus no keyword colorization for me. 😢

Does anyone know how to do this?


Solution

  • Your keyword regex starts with a lookbehind that requires a single whitespace, (, [ or { character immediately before a keyword. The _ from #_ doesn't meet that requirement.

    (?<=(\\s|\\(|\\[|\\{))
    

    You could simply add _ to the list of allowed characters:

    (?<=(\\s|\\(|\\[|\\{|_))
    

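    Note that this still wouldn't work as-is for your "#_:foo" test case, because the similar lookahead at the end of the regex requires whitespace, ), ], } or , after the keyword, and nothing follows :foo in that string. You could possibly allow $ there, make the match optional, or change the test case.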
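
    As a rough sketch of both suggestions combined, the keyword rule might look like the following, with _ added to the lookbehind and $ added to the lookahead. This is untested against the full grammar and assumes the rest of the rule stays exactly as in the question.

```cson
  'keyword':
    # Lookbehind: additionally accept the _ of #_ directly before the keyword.
    # Lookahead: additionally accept end of line ($), so "#_:foo" with nothing after it can match.
    # Assumption: no other rule depends on _ being rejected before a keyword.
    'match': '(?<=(\\s|\\(|\\[|\\{|_)):[\\w\\#\\.\\-\\_\\:\\+\\=\\>\\<\\/\\!\\?\\*]+(?=(\\s|\\)|\\]|\\}|\\,|$))'
    'name': 'constant.keyword.clojure'
```

    One trade-off to be aware of: allowing a bare _ in the lookbehind also lets a keyword match after any symbol that ends in an underscore, so it may be worth adding a spec for that case before relying on it.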