java regex string

How to extract a substring from a string by removing a specified word that can appear anywhere?


I have a string made of two words, for example "John Doe". One of the words is a keyword I already know, e.g. "Doe", and I want to extract the other word, "John".

I initially tried this approach:

String input = "John Doe";
String keyword = "Doe";
String result = input.replace(keyword, "").trim();

This works when the keyword is a separate word at the start or end of the string, since trim() removes the leftover space. However, a keyword in the middle leaves a double space behind, and replace() also removes the keyword when it is only part of another word. For example, with the input "Johnathan Doe" and the keyword "John", it incorrectly returns "athan Doe".

I need a way to reliably extract the remaining part of the string when the keyword is known, regardless of its position in the string, and to treat the keyword as a whole word.

I also tried this:

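String input = "John Doe";
String receiver = "Doe";
// split around the keyword, matched as a whole word
String[] parts = input.split("\\b" + receiver + "\\b");
String result = "";
if (parts.length > 1 && !parts[1].trim().isEmpty()) {
    // keyword came first, so the remainder is after it
    result = parts[1].trim();
}
else if (parts.length > 0 && !parts[0].trim().isEmpty()) {
    // keyword came last, so the remainder is before it
    result = parts[0].trim();
}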

But this feels long-winded. Surely there is a shorter way to do this?


Solution

  • You apparently don't want to extract anything. What you want is simply to delete a word and keep the remainder. I would try something like this. The regex matches the keyword only when it is delimited by whitespace or by the beginning or end of the string.

    String[] s = {"Johnathan Doe", "John Doe"};
    String keyWord = "John";
    for (String str : s) {
        // remove the keyword only when it is delimited by whitespace or the string boundaries
        str = str.replaceAll("(?:^|\\s+)" + keyWord + "(?:$|\\s+)", "");
        System.out.println(str);
    }
    

    prints

    Johnathan Doe
    Doe
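
One caveat with the pattern above: when the keyword sits in the middle of the string, the match consumes the whitespace on both sides, so "Mary Doe Smith" with the keyword "Doe" would come out as "MarySmith". Below is a minimal sketch of a variant that avoids this, assuming words are separated by whitespace only. The class name KeywordRemover and the method removeWord are hypothetical names chosen for illustration. Pattern.quote is used so that a keyword containing regex metacharacters is matched literally.

```java
import java.util.regex.Pattern;

public class KeywordRemover {

    // Removes every whole-word occurrence of keyword and tidies up the leftover whitespace.
    public static String removeWord(String input, String keyword) {
        // (?<!\S) and (?!\S) require whitespace or a string boundary on each side,
        // without consuming it. Pattern.quote treats the keyword as a literal.
        String regex = "(?<!\\S)" + Pattern.quote(keyword) + "(?!\\S)";
        return input.replaceAll(regex, " ")   // a space keeps neighbouring words apart
                    .trim()                   // drop the space left at either end
                    .replaceAll("\\s+", " "); // collapse runs of whitespace
    }

    public static void main(String[] args) {
        System.out.println(removeWord("John Doe", "Doe"));        // John
        System.out.println(removeWord("Doe John", "Doe"));        // John
        System.out.println(removeWord("Mary Doe Smith", "Doe"));  // Mary Smith
        System.out.println(removeWord("Johnathan Doe", "John"));  // Johnathan Doe
    }
}
```

Because the lookarounds test for whitespace rather than word boundaries, a keyword followed by punctuation (such as "Doe,") is left untouched. If that case matters, "\\b" on both sides of the quoted keyword may be a better fit.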