javamergesortcanonical-form

finding canonical form of a word (related to mergeSort, java)


I am stuck at making the canonical form of a word(canonical form of a word contains the same letters as the original word, yet in sorted order. eg. canonical form of "computer" is "cemoprtu" (consider their alphabetical order), that of "program" is "agmoprr". )

I use mergeSort to sort the letters of a word, using the following code. Yet it just give me the original word instead of the canonical form. can anyone tell me what's wrong with my code?

public static String canonicalForm(String word) {
    char[] string = new char[word.length()];
    for (int i = 0; i < word.length(); i ++) {
        string[i] = word.charAt(i);
    }
    mergeSort(string);
    String result = "";
    for (char ch: string) {
        result += ch;
    }
    System.out.println(result);
}

public static void mergeSort(char[] string) {
    if (string.length > 1) {
        char[] left = Arrays.copyOfRange(string, 0, string.length/2);
        char[] right = Arrays.copyOfRange(string, string.length/2, string.length);
        mergeSort(left);
        mergeSort(right);
        merge(string, left, right);
    }
}

public static void merge(char[] string, char[] left, char[] right) {
    int i1 = 0;
    int i2 = 0;
    for (int i = 0; i < string.length; i ++) {
        if (i1 < left.length) {
            if (left[i1] - right[i2] <= 0) {
                string[i] = left[i1];
                i1 ++;
            }
        } else if (i2 >= right.length) {
            string[i] = left[i1];
            i1 ++;
        } else {
            string[i] = right[i2];
            i2 ++;
        }
    }
}
}

Solution

  • The flaw is that your merge doesn't copy symbols from right part if left[i1] - right[i2] > 0.

    This works:

    public static void merge(char[] string, char[] left, char[] right) {
        int i1 = 0;
        int i2 = 0;
        for (int i = 0; i < string.length; i++) {
            if (i1 < left.length && i2 < right.length) {
                if (left[i1] - right[i2] <= 0) {
                    string[i] = left[i1];
                    i1++;
                } else {
                    string[i] = right[i2];
                    i2++;
                }
            } else if (i2 >= right.length) {
                string[i] = left[i1];
                i1++;
            } else {
                string[i] = right[i2];
                i2++;
            }
        }
    }
    

    You have three cases:

    1. There are symbols available in both parts. Take one with the least value.
    2. There are symbols available only in the left part. Copy them.
    3. There are symbols available only in the right part. Copy them.