javaarraysperformancearraylist

Performance of ArrayList<Integer> versus int[]


I have a million ints that I want to store in a data structure. I wanted to know whether the int array (int[]) is any more efficient than an ArrayList. Are there any performance gains to using int[] over ArrayList? I believe that ArrayList uses more memory than int[], but I don't know if that extra memory is negligible or not.

If I were to increase the size of the array (from one million to one billion ints) would the answer change? Is the extra memory used by ArrayList negligible or not?


Solution

  • By default, the capacity of an ArrayList may be up to twice the number of elements it contains. So, if you know your expected size, specify an initial capacity. If the size is not known, trim when finished adding elements.

    However, the main difference between ArrayList<Integer> and int[] is the difference between Integer and int: The former is an object, and each object stores some metadata for the JVM (runtime type, identity hash code, ...) in addition to its fields, which makes it somewhat bigger than a plain int. The size of this metadata is not specificed and may vary. You can measure it as follows:

    public class Test {
        
        static final int elements = 10_000_000;
        
        private static long usedMemory() {
            var r = Runtime.getRuntime();
            return r.totalMemory() - r.freeMemory();
        }
        
        static void measure(String name, Runnable code) {
            System.gc(); // clean up previous tests
            long start = usedMemory();
            code.run();
            long end = usedMemory();
            System.out.println(name + " ".repeat(35 - name.length()) + " used " + (double)(end - start) / elements + " bytes/element");
        }
        
        public static void main(String[] args) {
            measure("int[]", () -> { 
                var a = new int[elements];
            });
            measure("Integer[] with small numbers", () -> {
                var a = new Integer[elements];
                for (int i = 0; i < elements; i++) {
                    a[i] = i % 64;
                }
            });
            measure("Integer[] with large numbers", () -> {
                var a = new Integer[elements];
                for (int i = 0; i < elements; i++) {
                    a[i] = i;
                }
            });
            measure("ideal capacity ArrayList<Integer>", () -> {
                var a = new ArrayList<Integer>(elements);
                for (int i = 0; i < elements; i++) {
                    a.add(i);
                }           
            });
            measure("auto sized ArrayList<Integer>", () -> {
                var a = new ArrayList<Integer>();
                for (int i = 0; i < elements; i++) {
                    a.add(i);
                }
            });
        }
    }
    

    On my JVM, this prints:

    int[]                               used 4.1809328 bytes/element
    Integer[] with small numbers        used 4.1949816 bytes/element
    Integer[] with large numbers        used 20.7624576 bytes/element
    ideal capacity ArrayList<Integer>   used 20.8476584 bytes/element
    auto sized ArrayList<Integer>       used 30.3283208 bytes/element
    

    That is, on this JVM, an Integer object takes up 16 bytes, and a refence to an object 4 bytes (if my heap were large enough, it would need 8 bytes instead). If the Integer contain small numbers, the Integer objects will be reused, and you only pay for the reference. In contrast, if the Integer objects contain large numbers, a new Integer is created for every element.

    An ArrayList with ideal capacity is as big as an array of references. An ArrayList with automatic growth may be up to twice as large as needed.