clojureclojure-core.logic

Debugging a slow performing function in Clojure


I am trying to implement a solution for minimum-swaps required to sort an array in clojure.

The code works, but takes about a second to solve for the 7 element vector, which is very poor compared to a similar solution in Java. (edited) I already tried providing the explicit types, but doesnt seem to make a difference I tried using transients, but has an open bug for subvec, that I am using in my solution- https://dev.clojure.org/jira/browse/CLJ-787

Any pointers on how I can optimize the solution?

;; Find minimumSwaps required to sort the array. The algorithm, starts by iterating from 0 to n-1. In each iteration, it places the least element in the ith position. 

(defn minimumSwaps [input]
  (loop [mv input, i (long 0), swap-count (long 0)]
    (if (< i (count input))
       (let [min-elem (apply min (drop i mv))]
        (if (not= min-elem (mv i))
          (recur (swap-arr  mv i min-elem),
                 (unchecked-inc i),
                 (unchecked-inc swap-count))
          (recur mv,
                 (unchecked-inc i),
                 swap-count)))
      swap-count)))

(defn swap-arr [vec x min-elem]
  (let [y (long (.indexOf vec min-elem))]
    (assoc vec x (vec y) y (vec x))))

(time (println (minimumSwaps [7 6 5 4 3 2 1])))

Solution

  • There are a few things that can be improved in your solution, both algorithmically and efficiency-wise. The main improvement is to remember both the minimal element in the vector and its position when you search for it. This allows you to not search for the minimal element again with .indexOf.

    Here's my revised solution that is ~4 times faster:

    (defn swap-arr [v x y]
      (assoc v x (v y) y (v x)))
    
    (defn find-min-and-position-in-vector [v, ^long start-from]
      (let [size (count v)]
        (loop [i start-from, min-so-far (long (nth v start-from)), min-pos start-from]
          (if (< i size)
            (let [x (long (nth v i))]
              (if (< x min-so-far)
                (recur (inc i) x i)
                (recur (inc i) min-so-far min-pos)))
            [min-so-far min-pos]))))
    
    (defn minimumSwaps [input]
      (loop [mv input, i (long 0), swap-count (long 0)]
        (if (< i (count input))
          (let [[min-elem min-pos] (find-min-and-position-in-vector mv i)]
            (if (not= min-elem (mv i))
              (recur (swap-arr mv i min-pos),
                     (inc i),
                     (inc swap-count))
              (recur mv,
                     (inc i),
                     swap-count)))
          swap-count)))
    

    To understand where are the performance bottlenecks in your program, it is better to use https://github.com/clojure-goes-fast/clj-async-profiler rather than to guess.

    Notice how I dropped unchecked-* stuff from your code. It is not as important here, and it is easy to get it wrong. If you want to use them for performance, make sure to check the resulting bytecode with a decompiler: https://github.com/clojure-goes-fast/clj-java-decompiler

    A similar implementation in java, runs almost in half the time.

    That's actually fairly good for Clojure, given that you use immutable vectors where in Java you probably use arrays. After rewriting the Clojure solution to arrays, the performance would be almost the same.