Tags: regex, scala, pattern-matching

Pattern Matching Scala Regex evaluation


Suppose you have a String that contains the ampersand symbol (&). My goal is to add a space between the & and any adjacent character wherever one is missing.

For example:

Case 1: Body&Soul --> Body & Soul (working)
Case 2: Body  &Soul --> Body  & Soul (working)
Case 3: Body&  Soul --> Body &  Soul (working)
Case 4: Body&Soul&Mind --> Body & Soul & Mind (working)

Case 5: Body &Soul& Mind --> Body & Soul & Mind (not working)
Case 6: Body& Soul &Mind --> Body & Soul & Mind (not working)

My current attempt:

    def replaceEmployerNameContainingAmpersand(emplName: String): String = {
      val r = "(?<! )&(?! )".r.unanchored
      val r2 = "&(?! )".r.unanchored
      val r3 = "(?<! )&".r.unanchored

      emplName match {
        case r() => emplName.replaceAll("(?<! )&(?! )", " & ")
        case r2() => emplName.replaceAll("&(?! )", "& ")
        case r3() => emplName.replaceAll("(?<! )&", " &")
      }
    }

The goal is to fix Cases 5 and 6: both Body &Soul& Mind and Body& Soul &Mind should become Body & Soul & Mind.

But it's not working: a match expression runs only the first case that matches, so when the r2 or r3 case fires, the other replacement is never applied and the other & is left as is.

Can anyone help me handle Cases 5 and 6?


Solution

  • You can capture an optional single whitespace char on each side of the &, check whether each group matched, and build the replacement accordingly using replaceAllIn:

    def replaceAllIn(target: CharSequence, replacer: (Match) => String): String
    Replaces all matches using a replacer function.

    See the Scala demo:

    val s = "Body&Soul, Body  &Soul, Body&  Soul, Body&Soul&Mind, Body &Soul& Mind, Body& Soul &Mind"
    val pattern = """(\s)?&(\s)?""".r
    val res = pattern.replaceAllIn(s, m => (if (m.group(1) != null) m.group(1) else " ") + "&" + (if (m.group(2) != null) m.group(2) else " ") )
    println(res)
    // => Body & Soul, Body  & Soul, Body &  Soul, Body & Soul & Mind, Body & Soul & Mind, Body & Soul & Mind
    

    The (\s)?&(\s)? pattern optionally captures a single whitespace char into Group 1, then matches &, and then optionally captures a single whitespace char into Group 2.

    If Group 1 is not null, a whitespace char was matched and we keep it; otherwise, we insert a space. The same logic applies to Group 2 (the trailing whitespace).
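
For completeness, here is a minimal sketch of how the same idea could be wrapped into the method from the question. The object name AmpersandSpacing is hypothetical, and using Option plus Regex.quoteReplacement is my own choice rather than part of the demo above: Option(...) turns a null group into None, and quoteReplacement guards against the replacement string being interpreted for $ or \ (not strictly needed here, since only whitespace and & are emitted).

```scala
import scala.util.matching.Regex

// Hypothetical wrapper object; only the method name comes from the question.
object AmpersandSpacing {
  // An optional single whitespace char on each side of '&'.
  private val Amp: Regex = """(\s)?&(\s)?""".r

  def replaceEmployerNameContainingAmpersand(emplName: String): String =
    Amp.replaceAllIn(emplName, m => {
      // group(n) is null when the optional group did not participate in the match.
      val before = Option(m.group(1)).getOrElse(" ")
      val after  = Option(m.group(2)).getOrElse(" ")
      Regex.quoteReplacement(before + "&" + after)
    })

  def main(args: Array[String]): Unit = {
    println(replaceEmployerNameContainingAmpersand("Body &Soul& Mind")) // Body & Soul & Mind
    println(replaceEmployerNameContainingAmpersand("Body& Soul &Mind")) // Body & Soul & Mind
    println(replaceEmployerNameContainingAmpersand("Body and Soul"))    // Body and Soul
  }
}
```

Because every & is handled by a single pass of replaceAllIn, there is no match expression that could stop at the first matching case, and a name without any & is returned unchanged instead of throwing a MatchError.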