I'm trying to write the result of a Cascalog query into a MySQL database. For this, I'm using cascading-jdbc and following an example I found here. I'm using cascading-jdbc-core and cascading-jdbc-mysql in version 3.0.0.
I'm executing precisely this code from my REPL:
(let [data [["foo1" "bar1"]
            ["foo2" "bar2"]]
      query-params (into-array String ["?col1" "?col2"])
      column-names (into-array String ["col1" "col2"])
      update-params (into-array String ["?col1"])
      update-column-names (into-array String ["col1"])
      jdbc-tap (fn []
                 (let [scheme (JDBCScheme.
                               (Fields. query-params)
                               column-names
                               nil
                               (Fields. update-params)
                               update-column-names)
                       table-desc (TableDesc.
                                   "test_table"
                                   query-params
                                   column-names
                                   (into-array String []))
                       tap (JDBCTap.
                            "jdbc:mysql://192.168.99.101:3306/test_db?user=root&password=my-secret-pw"
                            "com.mysql.jdbc.Driver"
                            table-desc
                            scheme)]
                   tap))]
  (?<- (jdbc-tap)
       [?col1 ?col2]
       (data ?col1 ?col2)))
When I'm running the code, I'm seeing these logs inside the REPL:
15/12/11 11:08:44 INFO hadoop.FlowMapper: sinking to: JDBCTap{connectionUrl='jdbc:mysql://192.168.99.101:3306/test_db?user=root&password=my-secret-pw', driverClassName='com.mysql.jdbc.Driver', tableDesc=TableDesc{tableName='test_table', columnNames=[?col1, ?col2], columnDefs=[col1, col2], primaryKeys=[]}}
15/12/11 11:08:44 INFO mapred.Task: Task:attempt_local1324562503_0006_m_000000_0 is done. And is in the process of commiting
15/12/11 11:08:44 INFO mapred.LocalJobRunner:
15/12/11 11:08:44 INFO mapred.Task: Task 'attempt_local1324562503_0006_m_000000_0' done.
15/12/11 11:08:44 INFO mapred.LocalJobRunner: Finishing task: attempt_local1324562503_0006_m_000000_0
15/12/11 11:08:44 INFO mapred.LocalJobRunner: Map task executor complete.
Everything looks fine. However, no data is written. I checked with tcpdump that not even a connection to my local MySQL database is being established. Also, when I change the JDBC connection string to obviously wrong values (user names that do not exist, a non-existent DB name and even a non-existent IP for the DB server), I get the same logs, which do not complain about anything.
Also, changing the jdbc-tap to stdout produces the expected values.
I have no idea how to debug this. Is there a way to get error output? Right now, I have no clue what is going wrong.
As it turns out, I was using the wrong version of cascading-jdbc. Cascalog 2.1.1 uses Cascading 2.5.3, whereas I had pulled in cascading-jdbc 3.0.0. Switching to a 2.5 version of cascading-jdbc fixed the problem.
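Since nothing fails loudly, a small sanity check at the REPL can help catch this kind of mismatch. The following is only a sketch: the helper names major-minor and same-release-line? are made up for illustration, and treating "same major.minor" as "compatible" is my assumption based on this one case, not a rule documented by cascading-jdbc.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: reduce a version string such as "2.5.3"
;; to its first two components, e.g. ["2" "5"].
(defn major-minor [version]
  (vec (take 2 (str/split version #"\."))))

;; Hypothetical helper: true when both versions share major.minor.
;; Assumption: matching major.minor is a reasonable proxy for
;; "cascading-jdbc was built against this Cascading release line".
(defn same-release-line? [a b]
  (= (major-minor a) (major-minor b)))

;; The combination from the question: Cascading 2.5.3 (via Cascalog 2.1.1)
;; together with cascading-jdbc 3.0.0.
(same-release-line? "2.5.3" "3.0.0") ; => false
```

The version strings are passed in by hand here; running lein deps :tree shows which Cascading version Cascalog actually pulls in.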
I was not able to see this from the error messages though (as there were none). One of the developers of cascading-jdbc was kind enough to point this out to me.