javaalgorithm

Find smallest string after a single substring rotation


I wrote a solution for this code challenge:

Given a string of lower case English letters of length n, select a substring in it and rotate that by 1 position and do this only once such that the result is alphabetically smallest.

For doing right shift operation, just move characters to right and pick the last character to first position of substring. see below example

Example

Input: "aahhab"
Output: "aaahhb"

Explanation:

Selected substring is [2,4] = "hha". It changes to "ahh"

So the result is the left portion + the substring + the right portion, which is "aa" + "ahh" + "b" = "aaahhb"

Constraints

  • 1 <= n <= 105
  • The input has only lower case English letters

My code

     public static String solve(String input) {
        int n = input.length();
        String result = input;
        
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j <= n; j++) {
                // Extract substring and rotate it
                String substring = input.substring(i, j);
                String rotated = rotateRight(substring);
                
                // Form new message
                String newMessage = input.substring(0, i) + rotated + input.substring(j);
                
                // Update result if new message is lexicographically smaller
                if (newMessage.compareTo(result) < 0) {
                    result = newMessage;
                }
            }
        }
        return result;
    }
    
    private static String rotateRight(String s) {
        if (s.length() <= 1) {
            return s;
        }
        return s.charAt(s.length() - 1) + s.substring(0, s.length() - 1);
    }

My code runs in O(n^3) because of nested loops and the substring method. The problem is that this is too slow for longer inputs: for a string of size 10000, it already takes a minute to complete. I need it to complete within a fraction of a second.

How to solve this in less than O(n^2) time complexity?


Solution

  • You can find the substring (that needs rotation) in linear time.

    Here is an implementation:

        public static String solve(String input) {
            int n = input.length();
            String result = input;
    
            int j = 0; // The last index of the substring (inclusive!)
            char least = 'z';
            char curr = input.charAt(0);
            for (int i = 1; i < n; i++) {
                char prev = curr;
                curr = input.charAt(i);
                if (curr < prev && curr <= least) {
                    j = i;
                    least = curr;
                }
            }
            for (int i = 0; i < j; i++) {
                if (input.charAt(i) > least) {
                    String substring = input.substring(i, j + 1);
                    String rotated = rotateRight(substring);
                    return input.substring(0, i) + rotated + input.substring(j + 1);
                }
            }
            return input; // The input is sorted; no rotation can improve that.
        }
    

    Note that this function returns the resulting string. Up to you to print it.

    Proof

    Let's name the two anchor indices for the substring to be rotated 𝑖 and 𝑗, so that the letter 𝑠[𝑗] will be moved to index 𝑖, and all other letters in between will shift to the right. Let's call the pair (𝑖, 𝑗) the "solution" when the corresponding rotation leads to the lexically smallest result string.

    We can take the following constructive steps for creating a list of candidates for (𝑖, 𝑗) and then reducing that list to just one such pair:

    1. 𝑠[𝑖] > 𝑠[𝑗]

    First, we can observe that if 𝑠[𝑖] < 𝑠[𝑗], then the rotation creates a result string that is lexically greater than the input string, and thus is not a solution. If 𝑠[𝑖] > 𝑠[𝑗], then we get a lexically smaller input string, which is a candidate. Remains the case where 𝑠[𝑖] == 𝑠[𝑗]. Then we have a few cases to consider:

    1. For all 𝑘 with 𝑖 < 𝑘 < 𝑗, we have 𝑠[𝑘] = 𝑠[𝑖]. In this case the rotation does not change the input string; we can discard this option.

    2. Let 𝑘 be the smallest index with 𝑖 < 𝑘 for which 𝑠[𝑘] != 𝑠[𝑖]. Then we have these two sub-cases:

      1. 𝑠[𝑘] < 𝑠[𝑖]. In this case the rotation will result in a greater string: it is not a solution.

      2. 𝑠[𝑘] > 𝑠[𝑖]. In this case the rotation will result in a smaller string. But then we could also have chosen 𝑖 equal to this 𝑘, which gives the same result, so that we arrive in the case where 𝑠[𝑖] > 𝑠[𝑗].

    So we can discard the cases where 𝑠[𝑖] = 𝑠[𝑗], and require that 𝑠[𝑖] > 𝑠[𝑗]. If there is no such index pair, then the input string is the solution. No rotation can improve on that.

    2. Minimise 𝑖

    We can see that among all index pairs (𝑖, 𝑗) that fit the previous condition, we should go for those that minimise 𝑖: as the first letter in the result that is "reduced" (compared to the original string) is the one at index 𝑖, the ones that reduce a letter at a smaller index give a better (lexically smaller) result.

    3. 𝑠[𝑗-1] > 𝑠[𝑗]

    If 𝑠[𝑗-1] = 𝑠[𝑗], then we might as well select 𝑗−1 as the last index for the rotation: the result of the rotation will be the same, so we can discard this case for the solution.

    If 𝑠[𝑗−1] < 𝑠[𝑗], then if we select 𝑗−1 as last index of the rotated substring, we will get a lexically smaller output string. So we should exclude this 𝑗 for the solution.

    4. Select 𝑗 for which 𝑠[𝑗] is minimised

    Of all valid index pairs (𝑖, 𝑗𝑘) that satisfy the above requirements (note that they all have the same 𝑖), get the minimum letter 𝑠[𝑗𝑚] among 𝑠[𝑗𝑘]. Then reject any index pair (𝑖, 𝑗𝑘) for which 𝑠[𝑗𝑘] is not that minimum letter. This is because that 𝑠[𝑗] will be moved to index 𝑖 by the rotation, so the better result is achieved with the smallest letter.

    5. Maximise 𝑗

    Let's say we have two index pairs that satisfy all of the above constraints, (𝑖, 𝑗1) and (𝑖, 𝑗2), with 𝑗1 < 𝑗2. We can visualise the input string as follows (with some example letters filled in):

    "....f...pe...qe...."
         ↑    ↑    ↑
         𝑖    𝑗1    𝑗2
    

    When the rotation on (𝑖, 𝑗1) is performed, we get 𝑠1:

    "....ef...p...qe...."
         ↑    ↑    ↑
         𝑖    𝑗1    𝑗2
    

    If on the other hand, the rotation on (𝑖, 𝑗2) is performed, we get 𝑠2:

    "....ef...pe...q...."
         ↑    ↑    ↑
         𝑖    𝑗1    𝑗2
    

    Now note how the two result strings only differ by one rotation on the interval (𝑗1+1, 𝑗2). If we apply that rotation on 𝑠1, we get 𝑠2: that rotation re-inserts the common letter (that was originally at both indices 𝑗1 and 𝑗2, and which is "e" in the example) at index 𝑗1+1, just "as if" it was shifted right from its source position at index 𝑗1, and "as if" the one that was originally at index 𝑗2 was re-inserted at index 𝑖.

    What can we say about the effect on 𝑠1 when we apply a rotation on (𝑗1+1, 𝑗2)?

    Let 𝑘 be the first index after 𝑗1 that has a letter greater than the (common) letter at 𝑗2. We know 𝑘 will be less than 𝑗2 because 𝑗2−1 is a candidate. Then all letters between indices 𝑗1 and 𝑘−1 (inclusive) are the same (cf. rule #4 above). From this follows that 𝑠2 is lexically smaller than 𝑠1.

    The conclusion is that we should maximise 𝑗. This gives us one unique (𝑖, 𝑗) pair for which the corresponding rotation necessarily gives the lexically minimal result string.

    The algorithm follows from the above five reductions of the candidate set.