Why does Java sort outperform Counting sort for primitives

I am comparing the running time of Counting sort with Java native Arrays.sort. Form what I've read the Counting sort offers a best, average and worst case n+k running time. Javas Arrays sort of primitives, using a dual - pivot Quicksort, is a comparison based algorithm thus must offer a O(n log n) in the average case, and On2 worst case.

When comparing the two by measuring time (nanoseconds) taken to sort a series of arrays ranging in size 500 to 100k, I noted a sharp increase in running time for the Counting sort when the size reached ~70k.

My understanding is the Counting sort is efficient as long as the range of input data is not significantly greater then the number of elements to be sorted The arrays are built from random numbers between 0 and 99, so k will always be much smaller than n.

Would there be any particular reason the Counting sort would degenerate so abruptly as n increases ?

My counting sort implementation:

public static int[] countSort(int[] arr, int k) {
        /*
         * This will only work on positive integers 0 to K.
         * For negative  worst case testing we use the negative count sort below.
         * 
         * Step 1: Use an array to store the frequency of each element. For array
         * elements 1 to K initialize an array with size K. Step 2: Add elements of
         * count array so each element stores summation of its previous elements. Step
         * 3: The modified count array stores the position of elements in actual sorted
         * array. Step 5: Iterate over array and position elements in correct position
         * based on modified count array and reduce count by 1.
         */

        int[] result = new int[arr.length];
        int[] count = new int[k + 1];
        for (int x = 0; x < arr.length; x++) {
            count[arr[x]]++;
        }

        /*
         * Store count of each element in the count array Count[y] holds the number of
         * values of y in the array 'arr'
         */

        for (int y = 1; y <= k; y++) {
            count[y] += count[y - 1];
        }

        /*
         * Change count[i] so that count[i] now contains actual Position of this element
         * in result array
         */

        for (int i = arr.length - 1; i >= 0; i--) {
            result[count[arr[i]] - 1] = arr[i];
            count[arr[i]]--;
        }

        System.out.println("COUNTSORT SORTED ARRAY = " + Arrays.toString(result));
        return result;

    }

Resolution: Running the Counting sort in place as per @Alex's suggestion resulted in a far more superior run time.

Solution

Just a guess, but your sorting algorithm uses much more memory than Java’s. 70k of ints are 280KB. You need double the space, more than 512KB. Depending on the processor used, that could make the difference between running the sort in (L1?) cache and having lots of cache misses. Since you don’t really need the copy, do the sort in place. If you now hit the wall later, you have the answer.

Edit: it’s 280KB.

Edit2: It was late yesterday, so here comes the in-place version. Note that it modifies the input array.

public static int[] countSortRefactored(int[] arr, int k) {
    int[] count = new int[k + 1];
    for (int x = 0; x < arr.length; x++) {
        count[arr[x]]++;
    }

    int idx=0;
    for (int x=0; x<=k; x++) {
        Arrays.fill(arr, idx, idx+=count[x], x);
    }

    System.out.println("COUNTSORT SORTED ARRAY = " + Arrays.toString(arr));
    return arr;
}