
Access an optional capture by name when using Swift Regex Builder


I'm just getting started with regular expressions and Swift Regex, so a heads-up that my terminology may be incorrect. I have boiled this problem down to a very simple task:

I have input lines that either contain just one word (a name) or start with the word "Test" followed by one space and then a name. I want to extract the name and also be able to access, without using match indices, the match for "Test " (which may be nil). Here is code that better describes the problem:

import RegexBuilder

let line1 = "Test John"
let line2 = "Robert"

let nameReference = Reference(String.self)
let testReference = Reference(String.self)

let regex = Regex {
    Optionally {
        Capture(as:testReference) {
            "Test "
        } transform : { text in
            String(text)
        }
    }
    Capture(as:nameReference) {
        OneOrMore(.any)
    } transform : { text in
        String(text)
    }
}

if let matches = try? regex.wholeMatch(in: line1) { // USE line1 OR line2 HERE
    let theName = matches[nameReference]
    print("Name is \(theName)")
    // using index to access the test flag works fine for both line1 and line2:
    if let flag = matches.1, flag == "Test " {
        print("Using index: This is a test line")
    } else {
        print("Using index: Not a test line")
    }
    // but for line2, attempting to access with testReference crashes:
    if matches[testReference] == "Test " { // crashes for line2 (not surprisingly)
        print("Using reference: This is a test line")
    } else {
        print("Using reference: Not a test line")
    }
}

When regex.wholeMatch() is called with line1 things work as expected with output:

Name is John
Using index: This is a test line
Using reference: This is a test line

but when called with line2 it crashes with a SIGABRT and output:

Name is Robert
Using index: Not a test line
Could not cast value of type 'Swift.Optional<Swift.Substring>' (0x7ff84bf06f20) to 'Swift.String' (0x7ff84ba6e918).

The crash is not surprising, because the Capture(as:testReference) was never matched.

My question is: is there a way to do this without using match indices (matches.1)? An answer using Regex Builder would be much appreciated:-)

The documentation says Regex.Match has a subscript(String) method which "returns nil if there's no capture with that name". That would be ideal, but it works only when the match output is type AnyRegexOutput.


Solution

  • While I would prefer Tom Harrington's solution for this particular use case, the API supports optional references by setting the type of the reference to an Optional itself:

    let nameReference = Reference(String.self)
    let testReference = Reference(String?.self)  // The String? is crucial here
    
    let regex = Regex {
        Optionally {
            Capture(as:testReference) {
                "Test "
            } transform : { text in
                String(text)
            }
        }
        Capture(as:nameReference) {
            OneOrMore(.any)
        } transform : { text in
            String(text)
        }
    }
    
    if let matches = try? regex.wholeMatch(in: line1) {
        if matches[testReference] == "Test " { // this does not crash; the subscript returns a String?
            print("Using reference: This is a test line")
        } else {
            print("Using reference: Not a test line")
        }
    }
    

    Note: If you want a reference to an optional Substring (Reference(Substring?.self)), you must use Capture(as:_:transform:) and return the optional from the transform closure; otherwise the compiler complains that Substring? and Substring are not equivalent.
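
To illustrate the note about Reference(Substring?.self), here is a minimal sketch under the same assumptions as the code above. The helper name parse(_:) and its tuple result are hypothetical and exist only for this example. The transform closure does nothing except wrap the matched Substring in an Optional so that its type lines up with the reference.

```swift
import RegexBuilder

// Hypothetical helper (not from the original post): reports whether a line
// carries the optional "Test " prefix and extracts the name that follows.
func parse(_ line: String) -> (isTest: Bool, name: Substring)? {
    // The reference is declared as Substring?, so the capture may be absent.
    let testReference = Reference(Substring?.self)
    let nameReference = Reference(Substring.self)

    let regex = Regex {
        Optionally {
            Capture(as: testReference) {
                "Test "
            } transform: { text in
                // Wrap the Substring so the closure returns Substring?,
                // matching the type of testReference.
                Optional(text)
            }
        }
        Capture(as: nameReference) {
            OneOrMore(.any)
        }
    }

    guard let match = try? regex.wholeMatch(in: line) else { return nil }
    // match[testReference] is a Substring?; it is nil when "Test " was absent.
    return (match[testReference] == "Test ", match[nameReference])
}

for line in ["Test John", "Robert"] {
    if let result = parse(line) {
        print("Name is \(result.name), test line: \(result.isTest)")
    }
}
```

With this setup, a line without the prefix should yield nil from match[testReference] rather than trapping, and no match indices are needed.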