I'm just getting started with regular expressions and Swift Regex, so a heads up that my terminology may be incorrect. I have boiled this problem down to a very simple task:
I have input lines that have either just one word (a name) or start with the word "Test" followed by one space and then a name. I want to extract the name and also be able to access - without using match indices - the match to "Test " (which may be nil). Here is code that better describes the problem:
import RegexBuilder
let line1 = "Test John"
let line2 = "Robert"
let nameReference = Reference(String.self)
let testReference = Reference(String.self)
let regex = Regex {
    Optionally {
        Capture(as: testReference) {
            "Test "
        } transform: { text in
            String(text)
        }
    }
    Capture(as: nameReference) {
        OneOrMore(.any)
    } transform: { text in
        String(text)
    }
}
if let matches = try? regex.wholeMatch(in: line1) { // USE line1 OR line2 HERE
    let theName = matches[nameReference]
    print("Name is \(theName)")
    // using index to access the test flag works fine for both line1 and line2:
    if let flag = matches.1, flag == "Test " {
        print("Using index: This is a test line")
    } else {
        print("Using index: Not a test line")
    }
    // but for line2, attempting to access with testReference crashes:
    if matches[testReference] == "Test " { // crashes for line2 (not surprisingly)
        print("Using reference: This is a test line")
    } else {
        print("Using reference: Not a test line")
    }
}
When regex.wholeMatch() is called with line1, things work as expected, with this output:
Name is John
Using index: This is a test line
Using reference: This is a test line
But when called with line2, it crashes with a SIGABRT and this output:
Name is Robert
Using index: Not a test line
Could not cast value of type 'Swift.Optional<Swift.Substring>' (0x7ff84bf06f20) to 'Swift.String' (0x7ff84ba6e918).
The crash is not surprising, because the Capture(as:testReference) was never matched.
My question is: is there a way to do this without using match indices (matches.1)? An answer using Regex Builder would be much appreciated :-)
The documentation says Regex.Match has a subscript(String) method which "returns nil if there's no capture with that name". That would be ideal, but it works only when the match output is of type AnyRegexOutput.
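For comparison, here is a minimal sketch of what that name-based subscript looks like when the output really is AnyRegexOutput. It assumes the pattern is written as a string with named groups instead of with Regex Builder, so it is not the builder-based solution I am after; parseLine and the group names "test" and "name" are made up for illustration.

```swift
// Sketch only: parseLine is a hypothetical helper, not part of any API.
func parseLine(_ line: String) -> (isTest: Bool, name: String)? {
    // A regex created from a string at run time has the type-erased
    // output AnyRegexOutput, so its captures can be looked up by name.
    guard let regex = try? Regex(#"(?<test>Test )?(?<name>.+)"#),
          let match = try? regex.wholeMatch(in: line),
          let name = match["name"]?.substring
    else { return nil }
    // The "test" group always exists in the pattern, but its substring
    // is nil when the optional group did not take part in the match.
    let isTest = match["test"]?.substring != nil
    return (isTest, String(name))
}

for line in ["Test John", "Robert"] {
    if let result = parseLine(line) {
        print("Name is \(result.name), test line: \(result.isTest)")
    }
}
```

The lookup never traps for an unmatched optional group, but the capture types are no longer checked at compile time, which is the trade-off against typed references.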
While I would prefer Tom Harrington's solution for this particular use case, the API supports optional references by setting the type of the reference to an Optional itself:
let nameReference = Reference(String.self)
let testReference = Reference(String?.self) // The String? is crucial here
let regex = Regex {
    Optionally {
        Capture(as: testReference) {
            "Test "
        } transform: { text in
            String(text)
        }
    }
    Capture(as: nameReference) {
        OneOrMore(.any)
    } transform: { text in
        String(text)
    }
}
if let matches = try? regex.wholeMatch(in: line1) {
    if matches[testReference] == "Test " { // this does not crash, but returns a String?
        print("Using reference: This is a test line")
    } else {
        print("Using reference: Not a test line")
    }
}
Note: if you want to have a reference to an optional Substring (Reference(Substring?.self)), then you must use Capture(as:_:transform:), because otherwise the compiler complains that Substring? and Substring are not equivalent.