javaalgorithmtime-complexity

Find longest substring


Given a string and list of words, I want to find the longest substring such that it does not have any word present in the provided list of words.

Constraints:

The length of the string is 1 to 10^5, with no spaces
The size of the words list is 1 to 10, with no spaces, and each word length is in the range of 1 to 10.

Example:

s = "helloworld"
words = ["wor", "rld"]

Solution :

Valid longest substring: hellowo , here we don't have "wor", "rld" in this substring. 
Result = 7

Here is my code, which works fine for smaller inputs:

int solve(String s, List<String> list) {
    int answer = 0, j=0;
    for(int i = 0 ; i <= s.length(); i++) {
        String sub = s.substring(j, i);
        for(String e : list) {
            if(sub.contains(e)){
                j = j+1;
                sub = s.substring(j, i);
            }
        }
        answer = Math.max(answer, i-j);
    }
    return answer;
}

The time complexity for this code is I assume O(m*n^2) where n is the size of the input string and m is the size of the list.

How to reduce time complexity and solve this with a better approach.

Edit: Updated the time complexity based on meriton explanation.


Solution

  • Your complexity analysis is wrong, because substring and contains are not constant time operations, but depend upon the length of the strings involved.

    Suppose the input string doesn't contain any search word. Then, you'll do substrings of length 1, 2, 3, ..., n, and call contains on each of them m times, for a total complexity of O(mn^2) ...

    To get an actual O(mn) algorithm, I'd do something like this:

    var answer = 0;
    var j = 0;
    for (int i = 0; i < s.length; i++) {
        while (endsWith(s, j, i, words) j++;
        answer = Math.max(answer, i - j);
    }
    

    where

    /** @return whether s[begin..end] ends with a word in words */
    boolean endsWith(String s, int begin, int end, List<String> words) {
        for (var word : words) {
            if (endsWith(s, begin, end, word)) {
                return true;
            }
        }
    }
    
    boolean endsWith(String s, int begin, int end, String word) {
        var offset = end - word.length;
        if (offset < begin) return false;
        for (int i = 0; i < word.length; i++) {
            if (s.charAt(offset + i) != word.charAt(i)) return false;
        }
        return true;
    }
    

    (this code is untested, you may have to fix minor bugs, but the general idea should be solid)

    Can you please share time complexity of this solution, I still see it as O(m*n^2)

    endsWith is O(m), and invoked at most 2n times, because each invocation increases i or j, each of which is increased at most n times.