Clojure / Incanter Data Transformation Capabilities


I'm considering Clojure / Incanter as an alternative to R, and I'm wondering whether Clojure / Incanter can do the following:

  1. Import the result of a SQL statement as a dataset (I do this in R using dbGetQuery).
  2. Reshape the dataset by turning rows into columns, also known as "pivot" / "unpivot" (I do this in R using the reshape and reshape2 packages; in the R world it's called melting and casting data).
  3. Save the reshaped dataset to a SQL table (I do this in R using the dbWriteTable function in RMySQL).

Solution

  • You may be interested in core.matrix, a project that brings multi-dimensional array and numerical computation capabilities to Clojure. It is still in very active development but already usable.

    See some example code here:

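      ;; load the core.matrix API and its operator overloads
      (use 'clojure.core.matrix)
      (use 'clojure.core.matrix.operators)

      ;; a matrix can be defined using a nested vector
      (def a (matrix [[2 0] [0 2]]))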
    
      ;; core.matrix.operators overloads operators to work on matrices
      (* a a)
    
      ;; a wide range of mathematical functions are defined for matrices
      (sqrt a)  
    
      ;; you can get rows and columns of matrices individually
      (get-row a 0)
    
      ;; Java double arrays can be used as vectors
      (* a (double-array [1 2]))
    
      ;; you can modify double arrays in place - they are examples of mutable vectors
      (let [a (double-array [1 4 9])]
        (sqrt! a)   ;; "!" signifies an in-place operator
        (seq a))
    
      ;; you can coerce matrices between different formats
      (coerce [] (double-array [1 2 3]))
    
      ;; scalars can be used in many places that you can use a matrix
      (* [1 2 3] 2)
    
      ;; operations on scalars alone behave as you would expect
      (* 1 2 3 4 5)
    
      ;; you can do various functional programming tricks with matrices too
      (emap inc [[1 2] [3 4]])
    

    core.matrix has been approved by Rich Hickey as an official Clojure contrib library, and it is likely that Incanter will switch over to using core.matrix in the future.

    SQL table support isn't directly included in core.matrix, but converting a result set from clojure.java.jdbc (which returns rows as a sequence of maps) into a core.matrix array only takes a one-liner. Something like the following should do the trick:

    (coerce [] (map vals resultset))
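
One caveat with `(map vals resultset)` is that Clojure maps do not promise a key order in general, so the column order of the resulting array may not be what you expect for wider rows. A minimal sketch of a variant that picks the columns explicitly is below; the sample rows, the `columns` vector and the `rows->nested-vector` helper are made up for illustration and are not part of core.matrix or clojure.java.jdbc.

```clojure
;; Hypothetical rows, in the shape clojure.java.jdbc returns from a query:
;; a sequence of maps keyed by column name.
(def resultset
  [{:id 1 :height 180.0 :weight 75.0}
   {:id 2 :height 165.0 :weight 60.0}])

;; The columns to keep, in the order they should appear in the array.
(def columns [:id :height :weight])

(defn rows->nested-vector
  "Turns a sequence of row maps into a vector of row vectors,
  reading the values of cols from each row in the given order."
  [cols rows]
  (mapv (apply juxt cols) rows))

(rows->nested-vector columns resultset)
;; => [[1 180.0 75.0] [2 165.0 60.0]]
```

The result is a plain nested vector, which is the same shape that `coerce []` produces, so it should be usable wherever core.matrix accepts a nested vector.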
    

    Then you can transform and process it with core.matrix however you like.
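
For the reshaping step in the question, melting and casting can also be sketched directly on the row maps in plain Clojure, before anything is converted to an array. The functions `cast-wide` and `melt` below are hypothetical helpers written for this example (they are not Incanter or core.matrix functions), and the table names, query and `db-spec` in the `comment` block are assumptions. The database round trip is left unevaluated because it needs clojure.java.jdbc and a JDBC driver on the classpath.

```clojure
;; Hypothetical "long" rows, in the shape a clojure.java.jdbc query returns.
(def long-rows
  [{:id 1 :variable "height" :value 180}
   {:id 1 :variable "weight" :value 75}
   {:id 2 :variable "height" :value 165}
   {:id 2 :variable "weight" :value 60}])

(defn cast-wide
  "Pivots long rows to wide rows: one output row per distinct id-key value,
  with one column per distinct value found under var-key.
  Assumes the var-key values are strings or keywords."
  [rows id-key var-key val-key]
  (->> rows
       (group-by id-key)
       (map (fn [[id group]]
              (into {id-key id}
                    (map (fn [r] [(keyword (get r var-key)) (get r val-key)]))
                    group)))
       (sort-by id-key)))

(defn melt
  "Unpivots wide rows to long rows: every non-id column becomes its own row
  with :variable and :value keys."
  [rows id-key]
  (for [row rows
        [k v] (dissoc row id-key)]
    {id-key (get row id-key) :variable (name k) :value v}))

(cast-wide long-rows :id :variable :value)
;; => ({:id 1, :height 180, :weight 75} {:id 2, :height 165, :weight 60})

(comment
  ;; Not evaluated: a sketch of the full round trip with clojure.java.jdbc.
  ;; db-spec, the query and both table names are placeholders.
  (require '[clojure.java.jdbc :as jdbc])
  (def db-spec {:dbtype "mysql" :dbname "mydb" :user "me" :password "secret"})
  (let [rows (jdbc/query db-spec ["SELECT id, variable, value FROM measurements"])]
    (jdbc/insert-multi! db-spec :measurements_wide
                        (cast-wide rows :id :variable :value))))
```

Working on the row maps keeps the column names around, which a bare numeric array would lose; the conversion to a core.matrix array can then happen after the reshape if numerical operations are needed.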