swift regex nsregularexpression

regex works in online tool but doesn't agree with NSRegularExpression


do {
    // initialization failed, looks like I can not use "\\" here
    let regex = try NSRegularExpression.init(pattern: "(?<!\\)\n")

    let string = """
    aaabbb
    zzz
    """
    
    // expect "aaabbb\nzzz"
    print(regex.stringByReplacingMatches(in: string, options: [], range: NSMakeRange(0, string.count), withTemplate: "\\n"))
} catch let error {
    print(error)
}

Here I want to replace "\n" in my string with "\\n", but it fails at the very beginning. The error message is:

// NSRegularExpression did not recognize the pattern correctly.
Error Domain=NSCocoaErrorDomain Code=2048 "The value “(?<!\)
” is invalid." UserInfo={NSInvalidValue=(?<!\)
}

The regex has been tested on regex101, so the pattern itself is right; it just doesn't work in Swift for some reason.

How can I do this?


Solution

  • Based on Larme's comment:

    In Swift, \\ (double backslash) in a String literal stands for a single \. As you can see in the error, the pattern received is (?<!\), which means you are escaping the closing ), so the group is missing its closing ). I'd say that you should write "(?<!\\\\)\n" instead?

    I finally figured out what's going on and how to fix it.

    The problem is backslash.

    In Swift, a backslash inside a double-quoted string literal starts an escape sequence, so a lone backslash followed by ) is rejected by the compiler:

    // won't compile
    // error: Invalid escape sequence in literal
    let regex = try NSRegularExpression.init(pattern: "(?<!\)\n")
    

    If we add another backslash, does it work?

    No. Swift collapses the two backslashes into one, and the regex engine then treats that single backslash as an escape for the closing ), so the lookbehind group is never closed.

    // compile but get a runtime error
    let regex = try NSRegularExpression.init(pattern: "(?<!\\)\n")
    

    Hence the runtime error

    NSRegularExpression did not recognize the pattern correctly.
    Error Domain=NSCocoaErrorDomain Code=2048 "The value “(?<!\)
    ” is invalid." UserInfo={NSInvalidValue=(?<!\)
    

    To get a literal backslash into the pattern, we actually need four backslashes:

    let regex = try NSRegularExpression.init(pattern: "(?<!\\\\)\n")
    

    Swift reduces the four backslashes to two, and the regex engine then reads those two as an escape character followed by the backslash it escapes, i.e. one literal backslash.

    This is quite cumbersome.
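
    The two layers of escaping can be checked directly. The following is a minimal sketch, assuming Foundation is available; the helper backslashCount and the variable names are made up for illustration. It counts how many backslashes survive Swift's own escaping and then asks NSRegularExpression whether each result compiles.

```swift
import Foundation

// Each literal is first processed by the Swift compiler; only the
// resulting characters are handed to NSRegularExpression.
let twoBackslashes = "(?<!\\)\n"     // becomes (?<!\) followed by a newline
let fourBackslashes = "(?<!\\\\)\n"  // becomes (?<!\\) followed by a newline

// Hypothetical helper: counts the backslashes that reach the regex engine.
func backslashCount(_ s: String) -> Int {
    return s.filter { $0 == "\\" }.count
}

print(backslashCount(twoBackslashes))   // 1
print(backslashCount(fourBackslashes))  // 2

// One backslash escapes the ")" and leaves the lookbehind group open.
let brokenCompiles = (try? NSRegularExpression(pattern: twoBackslashes)) != nil
// Two backslashes form one literal backslash, so the group closes normally.
let fixedCompiles = (try? NSRegularExpression(pattern: fourBackslashes)) != nil

print(brokenCompiles)  // false
print(fixedCompiles)   // true
```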

    Better Approach

    Fortunately, starting with Swift 5, we can use a raw string literal, delimited by a pair of #, so that backslashes are passed to the regex engine unchanged:

    // works like in online tool
    let regex = try NSRegularExpression.init(pattern: #"(?<!\\)\n"#)
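
    As a quick sanity check, here is a small sketch (assuming Foundation is available; the sample text and the names raw, escaped and text are invented for illustration). It compares the raw literal with a conventionally escaped one and confirms that only the newline not preceded by a backslash matches.

```swift
import Foundation

// In a raw string, backslashes are ordinary characters, so the literal
// is exactly the text one would type into an online regex tester.
let raw = #"(?<!\\)\n"#
let escaped = "(?<!\\\\)\\n"  // the same characters written with Swift escapes

print(raw == escaped)  // true
print(raw)             // (?<!\\)\n

// try! is used only because the pattern is a fixed, known-good literal.
let regex = try! NSRegularExpression(pattern: raw)

// Sample text: a, newline, b, backslash, newline, c
let text = "a\nb\\\nc"
let range = NSRange(text.startIndex..., in: text)

// Only the first newline matches; the second one follows a backslash.
let matchCount = regex.numberOfMatches(in: text, options: [], range: range)
print(matchCount)  // 1
```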
    

    Another thing

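    It’s worth noting that the pattern isn’t the only argument that needs this care. The replacement template also treats backslash as an escape character, so a literal backslash in the output has to be doubled there as well: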

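    // The template \\n produces a literal backslash followed by "n".
    // The range length is in UTF-16 units, which is what NSRange expects.
    print(regex.stringByReplacingMatches(in: string, options: [], range: NSMakeRange(0, string.utf16.count), withTemplate: #"\\n"#))
    
    // By comparison, the plain String API needs only the usual Swift escaping
    print(string.replacingOccurrences(of: "\n", with: "\\n"))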
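
    Putting the pieces together, a possible end-to-end version looks like the sketch below. It assumes Foundation and uses NSRegularExpression.escapedTemplate(for:) so the replacement can be written as plain text; the variable names are illustrative, not from the original code.

```swift
import Foundation

let string = "aaabbb\nzzz"

// try! is acceptable here only because the pattern is a fixed literal.
let regex = try! NSRegularExpression(pattern: #"(?<!\\)\n"#)

// A UTF-16 based range. string.count counts Characters and can differ
// from the NSString length once the text contains emoji or similar.
let range = NSRange(string.startIndex..., in: string)

// escapedTemplate(for:) adds the backslashes the template syntax needs,
// so the desired output (backslash + "n") is written just once.
let template = NSRegularExpression.escapedTemplate(for: #"\n"#)

let result = regex.stringByReplacingMatches(
    in: string,
    options: [],
    range: range,
    withTemplate: template
)
print(result)  // aaabbb\nzzz
```

    Letting the API escape the template avoids counting backslashes by hand, at the cost of one extra call.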