I am currently implementing selectionsort. Using the code below and the driver file to test it below. I am currently trying to do micro optimizations to see what speeds it up.
public static void selectionsort(int[] data) {
for (int i = 0; i < data.length; i++) {
int minx = i;
for (int j = i+1; j < data.length; j++) {
if(data[j] < data[minx]){
minx = j;
}
// else{
// minx=minx;
// }
}
int swap = data[minx];
data[minx] = data[i];
data[i] = swap;
}
}
import java.util.Arrays;
import java.util.Random;
public class Driver {
public static void main(String[] args) {
Random r = new Random(12);
// CODE-BLOCK-A
// run selection sort version of tests
double avg = 0;
for (int i = 0; i < 100; i++) {
long time = System.currentTimeMillis();
int[] sgarr = randIntArr(Integer.parseInt(args[0]),r);
int[] tests = Arrays.copyOf(sgarr, sgarr.length);
switch (args[1]) {
case "selection":
Sorts.selectionsort(sgarr);
break;
case "bubble":
Sorts.bubblesort(sgarr);
default:
Sorts.insertionsort(sgarr);
break;
}
Arrays.sort(tests);
if (!equals(tests, sgarr)) {
System.out.println("FAIL TO SORT!");
System.exit(0);
}
// System.out.println("SUCCESS!");
long etime = System.currentTimeMillis();
avg += (etime-time);
}
System.out.println("Took "+ avg / (100.0*1000.0) +" seconds");
}
public static int[] randIntArr(int count,Random r) {
int seed = r.nextInt(12433);
r.setSeed(seed);
int[] arr = new int[count];
for (int i = 0; i < count; i++) {
arr[i] = (int) (r.nextDouble() * 100000);
}
return arr;
}
public static boolean equals(int[] d1, int[] d2) {
if (d1.length != d2.length) {
return false;
}
for (int i = 0; i < d2.length; i++) {
if (d1[i] != d2[i]) {
return false;
}
}
return true;
}
}
What I have tried doing to speed it up is using an else block that should do nothing. I expected this to slow it down as now it is checking an if statement every time like normal but also assigning minx to itself which should slow it down. But if I put an empty else statement the improvements go away.
However when I run it using this command, when the else block is commented out I get:
javac Driver.java
&& java Driver 40000 selection
Took 0.34416 seconds
However, when I uncomment out the else block and run the same command it gets noticeably faster
Took 0.30086 seconds
Meaning for the best case scenarios on both it is about a 13.43% difference.
I am assuming that this change is significant enough to not just be random considering I also ran each one multiple times. I then asked my CS teacher about it and he explained it is probably branch prediction. Which does make sense however what I wonder is if I get the same performance enhancements without an essentially useless else statement or if I can't what is considered better practice to keep or to remove it. Or is 13.43% significant, I don't believe it is due to my computer because I ran it many times and the values were within 3% of each time. This being ran on x86, if that matters.
Tested currently using this main file
package org.example;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import java.util.concurrent.TimeUnit;
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Benchmark)
public class Main {
@Param({"100", "1000", "10000"}) // You can add more values for array sizes
private int arraySize;
private int[] randomArray;
@Setup(Level.Iteration)
public void setup() {
// Initialize the array with random data
randomArray = createRandomArray(arraySize);
}
@Benchmark
public void selectionSortBenchmark() {
Sorts.selectionsort(randomArray.clone());
}
private int[] createRandomArray(int size) {
int[] array = new int[size];
for (int i = 0; i < size; i++) {
array[i] = (int) (Math.random() * 1000); // Adjust the range as needed
}
return array;
}
public static void main(String[] args) throws RunnerException {
Options opt = new OptionsBuilder()
.include(Main.class.getSimpleName())
.forks(1)
.warmupIterations(5)
.measurementIterations(5)
.build();
new Runner(opt).run();
}
}
The reason as @PeterCordes said in the comments was a testing issue when using correct testing software like JMH all the data points are very similar and definitely within my computer have more or less resources available or some other randomness-ish. Here are the results when using JMH:
Commented out else block
Main.selectionSortBenchmark 100 avgt 5 0.003 ± 0.001 ms/op
Main.selectionSortBenchmark 1000 avgt 5 0.266 ± 0.050 ms/op
Main.selectionSortBenchmark 10000 avgt 5 19.087 ± 0.702 ms/op
Not commented out else block
Main.selectionSortBenchmark 100 avgt 5 0.003 ± 0.001 ms/op
Main.selectionSortBenchmark 1000 avgt 5 0.234 ± 0.010 ms/op
Main.selectionSortBenchmark 10000 avgt 5 18.468 ± 3.204 ms/op
Empty else block
Main.selectionSortBenchmark 100 avgt 5 0.003 ± 0.001 ms/op
Main.selectionSortBenchmark 1000 avgt 5 0.273 ± 0.005 ms/op
Main.selectionSortBenchmark 10000 avgt 5 20.022 ± 2.740 ms/op