java string replaceall

How to remove multiple words from a string in Java


I'm new to Java and am currently learning about strings.

How to remove multiple words from a string?

I would be glad for any hint.

class WordDeleterTest {
    public static void main(String[] args) {
        WordDeleter wordDeleter = new WordDeleter();

        // Hello
        System.out.println(wordDeleter.remove("Hello Java", new String[] { "Java" }));

        // The Athens in
        System.out.println(wordDeleter.remove("The Athens is in Greece", new String[] { "is", "Greece" }));
    }
}

class WordDeleter {
    public String remove(String phrase, String[] words) {
        String[] array = phrase.split(" ");
        String word = "";
        String result = "";

        for (int i = 0; i < words.length; i++) {
            word += words[i];
        }
        for (String newWords : array) {
            if (!newWords.equals(word)) {
                result += newWords + " ";
            }
        }
        return result.trim();
    }
}

Output:

Hello
The Athens is in Greece

I've already tried to use replace here, but it didn't work.


Solution

  • Programmers often do this:

    String sentence = "Hello Java World!";
    sentence.replace("Java", "");
    System.out.println(sentence);
    

    => Hello Java World!

    Strings are immutable, and the replace method returns a new String object instead of modifying the original. So write this instead:

    String sentence = "Hello Java World!";
    sentence = sentence.replace("Java", "");
    System.out.println(sentence);
    

    => Hello  World!

    (the spaces on both sides of the removed word remain, so there are now two in a row)

    With that, your remove method could look like this:

    public String remove(String phrase, String[] words) {
        String result = phrase;
        for (String word: words) {
            result = result.replace(word, "").replace("  ", " ");
        }
        return result.trim();
    }
    

    Called with "Hello Java World!" and { "Java" }, this returns

    => Hello World!

    (the doubled whitespace has been collapsed)

    Note that this solution removes every occurrence of the word within the phrase, whether it is a whole word or part of another word. As the OP commented, removing "is" from "This is Sparta" results in "Th Sparta". To get around that, make sure the word to be replaced stands on its own, delimited by whitespace. This is a perfect situation to switch to regular expressions.

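    public String remove(String phrase, String[] words) {
        String result = phrase;
        for (String word : words) {
            // Match the word (quoted, so it is taken literally) only when it is
            // preceded by whitespace or the start of the string and followed by
            // whitespace or the end of the string.
            String regexp = "(^|\\s)" + java.util.regex.Pattern.quote(word) + "(?=\\s|$)";
            result = result.replaceAll(regexp, "");
        }
        return result.trim();
    }
    

    For explanation:

    The pattern sequence \s matches a whitespace character (space, tab, linefeed, ...). The double backslash is necessary so that the Java compiler does not interpret a single backslash as the start of a string escape. The group (^|\s) matches either the start of the string or the whitespace before the word, and the lookahead (?=\s|$) requires whitespace or the end of the string after the word without consuming it. Pattern.quote keeps characters such as . or + in the word from being read as regex syntax. So the regular expression matches the word together with the whitespace in front of it, and replaceAll removes that match, which leaves exactly one separator between the neighbouring words. That also means the second call to remove double blanks is unnecessary now.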
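As a variation, all words can be removed in a single pass by joining them into one alternation. The following is only a sketch under a few assumptions: words count as separate when they are delimited by whitespace or by the start or end of the string, and the class name WordRemoverDemo is made up for illustration. It uses the standard Pattern.quote, Pattern.compile and Matcher.replaceAll calls.

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical demo class, not part of the original answer.
class WordRemoverDemo {
    static String remove(String phrase, String[] words) {
        if (words.length == 0) {
            return phrase.trim(); // nothing to remove
        }
        // Quote each word so it is matched literally, then join with "|".
        String alternatives = Arrays.stream(words)
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        // Leading whitespace (or string start) + one of the words,
        // followed by whitespace or string end (not consumed).
        Pattern pattern = Pattern.compile("(^|\\s)(?:" + alternatives + ")(?=\\s|$)");
        return pattern.matcher(phrase).replaceAll("").trim();
    }

    public static void main(String[] args) {
        System.out.println(remove("Hello Java", new String[] { "Java" }));                      // Hello
        System.out.println(remove("The Athens is in Greece", new String[] { "is", "Greece" })); // The Athens in
        System.out.println(remove("This is Sparta", new String[] { "is" }));                    // This Sparta
    }
}
```

Compiling the pattern once avoids re-parsing a regular expression for every word, which may matter when the word list is long.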

    Here is a nice tutorial: https://docs.oracle.com/javase/tutorial/essential/regex/
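The split-based loop from the question can also be repaired without regular expressions. It fails because it concatenates all words into one string ("isGreece" in the second test), which never equals a single token of the phrase. A minimal sketch of one possible fix, assuming the phrase is split on whitespace and the comparison is case-sensitive (the class name SplitWordDeleter is hypothetical):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.StringJoiner;

// Hypothetical class name; a corrected version of the question's approach.
class SplitWordDeleter {
    static String remove(String phrase, String[] words) {
        // Keep the words in a set so each token can be checked individually.
        Set<String> unwanted = new HashSet<>(Arrays.asList(words));
        StringJoiner result = new StringJoiner(" ");
        for (String token : phrase.trim().split("\\s+")) {
            if (!unwanted.contains(token)) {
                result.add(token); // keep every token that is not in the list
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(remove("Hello Java", new String[] { "Java" }));                      // Hello
        System.out.println(remove("The Athens is in Greece", new String[] { "is", "Greece" })); // The Athens in
    }
}
```

Because whole tokens are compared, "is" no longer matches inside "This", and the output is always joined with single spaces.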