I have this query:
(?<- (hfs-textline data-out :sinkmode :replace)
[?item1 ?item2]
((hfs-textline data-in) ?line)
(data-line? ?line)
(filter-out-data (#(vector (s/split % #",")) ?line) :> ?item1 ?item2)
)
(defn data-line? [^String row]
(and (not= -1 (.indexOf row ","))
(not (.endsWith row ","))
(not (.startsWith row ","))))
(defn filter-out-data [data]
(<- [?item1 ?item2]
(data :#> 9 {4 ?item1
8 ?item2})))
The query reads a CSV file line by line and keeps only the lines that satisfy the valid-data conditions (data-line?); this part works. It is then supposed to split each line on commas and pass the resulting vector to the filter-out-data function, which should return two items extracted from that vector. When I execute the query I get the following error:
Unable to resolve symbol: ?line in this context.
I have been trying out different ways of passing the result of split (I would like it to be flexible as the split will differ in size). I am just starting with Clojure and Cascalog and I will be grateful if you could point me in the right direction. Thanks!
The function filter-out-data generates a subquery, but you are trying to use it as a predicate, and that is not going to work.
I recommend moving all the logic in the expression (#(vector (s/split % #",")) ?line) into a regular function, which you can still call filter-out-data.
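(defn filter-out-data [data]
  ;; s/split takes the string first, then the regex.
  ;; The destructuring picks the 4th and 8th fields (zero-based positions 3 and 7).
  (let [[_ _ _ item1 _ _ _ item2] (s/split data #",")]
    [item1 item2]))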
(?<- (hfs-textline data-out :sinkmode :replace)
[?item1 ?item2]
((hfs-textline data-in) ?line)
(data-line? ?line)
(filter-out-data ?line :> ?item1 ?item2))
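Since you mention that the number of fields per line may vary, here is a rough sketch of a more flexible variant. The helper name pick-fields and the positions [3 7] are my own assumptions for illustration; adjust them to the columns you actually need. It is plain Clojure, so you can try it at the REPL before wiring it into the query.

```clojure
(require '[clojure.string :as s])

;; Hypothetical helper (not part of Cascalog): returns the fields found at
;; the given zero-based positions. A position past the end of the line
;; yields nil instead of throwing.
(defn pick-fields [line positions]
  (let [fields (s/split line #",")]
    (mapv #(nth fields % nil) positions)))

;; Same name and arity as the function above, so the query itself
;; does not need to change.
(defn filter-out-data [line]
  (pick-fields line [3 7]))
```

For example, (filter-out-data "a,b,c,d,e,f,g,h,i") should give ["d" "h"], and a line with fewer than eight fields gives nil for the missing item rather than an exception.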
You can simplify the code even further by using a CSV library such as data.csv.
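A minimal sketch of what that could look like, assuming org.clojure/data.csv is on your classpath and that each ?line holds exactly one CSV record. The positions 3 and 7 are carried over from the destructuring above and are only an assumption about which columns you want.

```clojure
(require '[clojure.data.csv :as csv])

;; read-csv accepts a String (or a Reader) and returns a lazy sequence of
;; vectors, one per record, so `first` yields the fields of a single line.
(defn filter-out-data [line]
  (let [fields (first (csv/read-csv line))]
    ;; nth with a default avoids an exception on short lines.
    [(nth fields 3 nil) (nth fields 7 nil)]))
```

The main benefit over a plain split is that quoted fields containing commas are parsed as a single field.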