clojure incanter

Incanter - How can I use filter with column keywords instead of nth?


(require '[incanter.core :as icore])

;; Assume dataset "data" is already loaded by incanter.core/read-dataset

;; Let's examine the columns (note that :Volume is at index 5, i.e. the sixth column)
(icore/col-names data)
==> [:Date :Open :High :Low :Close :Volume]

;; We CAN use the :Volume keyword to look at just that column
(icore/sel data :cols :Volume)
==> (11886469 9367474 12847099 9938230 11446219 12298336 15985045...)

;; But we CANNOT use the :Volume keyword with filters
;; (well, not without looking up the position in col-names first...)
(icore/sel data :filter #(> (#{:Volume} %) 1000000))

Obviously this fails because the filter's anonymous function receives each row as a LazySeq, which no longer has the column names as part of its structure. My question is this: does Incanter have a way to perform this filtered query while still allowing me to use column keywords? For example, I can get this to work because I know that :Volume is at index 5:

(icore/sel data :filter #(> (nth % 5) 1000000))

Again, though, I'm looking to see if Incanter has a way of preserving the column keyword for this type of filtered query.
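
For reference, the "look up the position in col-names first" workaround mentioned above can be sketched roughly as follows. This is only an illustration: the `sample` dataset and its values are made up, `col-index` is a hypothetical helper (not part of Incanter), and the `:filter` option is used exactly as in the working call above, on the assumption that your Incanter version accepts it under that name.

```clojure
(require '[incanter.core :as icore])

;; Made-up stand-in for the real dataset; the values are illustrative only.
(def sample
  (icore/dataset
    [:Date :Close :Volume]
    [["2012-01-03" 10.0  900000]
     ["2012-01-04" 10.5 1500000]
     ["2012-01-05" 11.0 2000000]]))

;; Hypothetical helper: zero-based position of a column keyword in col-names.
(defn col-index [ds col]
  (.indexOf (icore/col-names ds) col))

(def volume-idx (col-index sample :Volume))

;; Same shape as the nth-based filter, but the index is computed
;; from the keyword instead of being hard-coded.
(def high-volume
  (icore/sel sample :filter #(> (nth % volume-idx) 1000000)))
```

This keeps the keyword in the source code, but the row is still accessed by position, so it is a workaround rather than a keyword-aware query.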


Solution

  • Example dataset:

    (def data
      (icore/dataset
        [:foo :bar :baz :quux]
        [[0 0 0 0]
         [1 1 1 1]
         [2 2 2 2]]))
    

    Example query with result:

    (icore/$where {:baz {:fn #(> % 1)}} data)
    
    | :foo | :bar | :baz | :quux |
    |------+------+------+-------|
    |    2 |    2 |    2 |     2 |
    

    Actually, this could also be written:

    (icore/$where {:baz {:gt 1}} data)
    

    Several such "predicate keywords" are supported apart from :gt: :lt, :lte, :gte, :eq (corresponding to Clojure's =), :ne (not=), :in, :nin (not in).

    :fn is the general "use any function" keyword.

    All of these can be prefixed with $ (:$fn etc.) with no change in meaning.
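
Applied to the :Volume case from the question, the same approach might look like the sketch below. The `sample` dataset and its values are made up for illustration; only `dataset`, `$where` and `nrow` from incanter.core are used, together with the predicate keywords listed above.

```clojure
(require '[incanter.core :as icore])

;; Made-up stand-in for the asker's dataset; the values are illustrative only.
(def sample
  (icore/dataset
    [:Date :Close :Volume]
    [["2012-01-03" 10.0  900000]
     ["2012-01-04" 10.5 1500000]
     ["2012-01-05" 11.0 2000000]]))

;; Rows whose :Volume exceeds 1000000, selected by column keyword.
(def high-volume
  (icore/$where {:Volume {:gt 1000000}} sample))

;; Conditions on several columns in one query map are combined with AND.
(def high-volume-low-close
  (icore/$where {:Volume {:gt 1000000}
                 :Close  {:lt 11.0}}
                sample))

;; :in takes a set of accepted values.
(def selected-volumes
  (icore/$where {:Volume {:in #{900000 2000000}}} sample))

;; Row counts of the filtered datasets.
(println (icore/nrow high-volume)
         (icore/nrow high-volume-low-close)
         (icore/nrow selected-volumes))
```

The result of each query is itself a dataset, so the column keywords remain available for further `$where` or `sel` calls.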