javaalgorithmdynamic-programming

Extra characters in a string | LeetCode 2707


I am stuck while solving todays daily Leetcode problem and I'm asking for your guys' help. Here is the task:

You are given a 0-indexed string s and a dictionary of words dictionary. You have to break s into one or more non-overlapping substrings such that each substring is present in dictionary. There may be some extra characters in s which are not present in any of the substrings. Return the minimum number of extra characters left over if you break up s optimally.

Here is the input that we are given:

s = "rkmsilizktprllwoimafyuqmeqrujxdzgp"

dictionary = ["afy","lyso","ymdt","uqm","cfybt","lwoim","hdzeg","th","rkmsi","d","e","tp","r","jx","tofxe","etjx","llqs","cpir","p","ncz","ofeyx","eqru","l","demij","tjky","jgodm","y","ernt","jfns","akjtl","wt","tk","zg","lxoi","kt"]

I'm at a point where I passed 2020/2028 tests, but I just don't understand where the error is in my code:

class Solution {
    public int minExtraChar(String s, String[] dictionary) {
        List<String> dictionary2 = new ArrayList<>(); 
        for (String str: dictionary){
            if (s.contains(str)){
                dictionary2.add(str);
            }
        }
        List<String> sorted = dictionary2.stream()
            .sorted(Comparator.comparingInt(String::length).reversed())
            .collect(Collectors.toList());
        for (String a: sorted){
            System.out.println(a);
        }
        for (String str: sorted){
            if (s.contains(str)){
                System.out.println("----------------------------");
                System.out.println(s);
                System.out.println(str);
                String ak = " ";
                ak = ak.repeat(str.length());
                s = s.replace(str,ak);
            }
        }
        s = s.replace(" ","");
        System.out.println(s);
        return s.length();
    }
}

The output is supposed to be 2 while my output is 3.

I know that this is not the most conventional way of solving this problem, but is there a way to alter this code to make it work or should I start from the beggining?

Thanks for any and all help :).


Solution

  • The greedy approach does not work here; employ dynamic programming instead. We can compute the minimum number of left over characters when considering each prefix of the whole string.

    While iterating over the characters of the string, the current index indicates the ending point of a certain prefix. The current character is a decision point and we can try all possible ways to use the current character.

    If we do not take the current character, the number of left over characters up to this index is one more than the number of left over characters for the best solution at the previous index.

    We can then try all suffixes of the current prefix and check if they exist in the dictionary. If so, we can try updating the current answer for this index with the best answer for the index one before the start of this substring (there are no extra left over characters when using the whole substring).

    This trivially leads to a O(|S|^3) solution (where |S| denotes the length of the input string S). The transition can be optimized to be linear in the worst case with a trie, but that is unnecessary since the strings can only be at most 50 characters in length.

    public int minExtraChar(String s, String[] dictionary) {
        var words = new HashSet<>(Arrays.asList(dictionary));
        var minLeftOver = new int[s.length() + 1];
        for (int i = 1; i <= s.length(); i++) {
            minLeftOver[i] = 1 + minLeftOver[i - 1];
            for (int j = 0; j < i; j++)
                if (words.contains(s.substring(j, i)))
                    minLeftOver[i] = Math.min(minLeftOver[i], minLeftOver[j]);
        }
        return minLeftOver[s.length()];
    }