
Syntax highlighting for multiline string literals


I have a custom language for which I want to provide syntax highlighting in Visual Studio Code.

In this language, string literals start and end with either a double or a single quote. A literal that starts with a double quote may contain single quotes and vice versa; there is no other way to escape quotes. Regardless of the opening quote type, a literal may span multiple lines.

In my tmLanguage.json file I tried these regular expressions:

{
    "match": "'[^']*'",
    "name": "string.quoted.single.rss"
},
{
    "match": "\"[^\"]*\"",
    "name": "string.quoted.double.rss"
}

Unfortunately, this only works for single-line literals. I then tried:

{
    "match": "(?s)'[^']*'",
    "name": "string.quoted.single.rss"
},
{
    "match": "(?s)\"[^\"]*\"",
    "name": "string.quoted.double.rss"
}

But with the (?s) modifier the syntax highlighting doesn't work at all.

Is there a way to match multiline strings?


Solution

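  • Use a begin / end pattern instead of a simple match. TextMate grammars are applied to one line at a time, so a match pattern can never span a line break, whatever flags it carries; a begin / end rule stays open across lines until its end pattern matches. The Haxe language also has multiline string literals, and its grammar matches strings like this: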

    strings:
      patterns:
      - begin: '"'
        beginCaptures:
          '0': {name: punctuation.definition.string.begin.hx}
        end: '"'
        endCaptures:
          '0': {name: punctuation.definition.string.end.hx}
        name: string.quoted.double.hx
        patterns:
        - include: '#string-escape-sequences'
    

    The grammar uses YAML instead of JSON to avoid having to escape regexes, but it should be fairly straightforward to translate. The source for the snippet is here.
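
    For the language in the question, a JSON translation might look like the sketch below. It is untested against the actual grammar: the "name" and "scopeName" values and the punctuation scope names are assumptions chosen to match the .rss suffix used in the question, and the nested escape-sequence include is dropped because the language has no escapes.

```json
{
    "name": "RSS",
    "scopeName": "source.rss",
    "patterns": [
        { "include": "#strings" }
    ],
    "repository": {
        "strings": {
            "patterns": [
                {
                    "comment": "Single-quoted literal; may contain double quotes and line breaks.",
                    "name": "string.quoted.single.rss",
                    "begin": "'",
                    "beginCaptures": {
                        "0": { "name": "punctuation.definition.string.begin.rss" }
                    },
                    "end": "'",
                    "endCaptures": {
                        "0": { "name": "punctuation.definition.string.end.rss" }
                    }
                },
                {
                    "comment": "Double-quoted literal; may contain single quotes and line breaks.",
                    "name": "string.quoted.double.rss",
                    "begin": "\"",
                    "beginCaptures": {
                        "0": { "name": "punctuation.definition.string.begin.rss" }
                    },
                    "end": "\"",
                    "endCaptures": {
                        "0": { "name": "punctuation.definition.string.end.rss" }
                    }
                }
            ]
        }
    }
}
```

    Because each rule only closes on its own quote character, the other quote type and any line breaks inside the literal stay part of the string scope.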