I'm using JayDeBeAPI in PySpark (the Apache Spark Python API), and here's the beginning of my code (note, I'm actually running all this through an interactive shell with PySpark).
import jaydebeapi
import jpype
conn = jaydebeapi.connect('org.apache.phoenix.jdbc.PhoenixDriver',
['jdbc:phoenix:hostname', '', ''])
I am querying Apache Phoenix, which is an SQL "front-end" for Apache HBase.
Here's my Python code for the SQL query:
curs = conn.cursor()
curs.execute('select "username",count("username") from "random_data" GROUP BY "username"')
curs.fetchall()
The output I'm getting is like this for all the rows:
(u'Username', <jpype._jclass.java.lang.Long object at 0x25d1e10>)
How can I get the actual value of that returned column (the count
column) instead of the wrapper object?
According to the Apache Phoenix datatypes page, the datatype of the count
column is BIGINT, which is mapped to java.lang.Long, but for some reason
jpype is returning the raw Java object instead of a Python value.
I installed JayDeBeApi 0.1.4 and JPype 0.5.4.2 by running python setup.py install
after downloading them.
The object returned by JPype is a Python wrapper around Java's java.lang.Long
class. To get the value out of it, use the value attribute:
>>> n = java.lang.Long(44)
>>> n
<jpype._jclass.java.lang.Long object at 0x2377390>
>>> n.value
44L
JayDeBeApi contains a dict (_DEFAULT_CONVERTERS) that maps types it
recognises to functions that convert the Java values to Python values. This
dict can be found at the bottom of __init__.py in the JayDeBeApi source code.
BIGINT is not included in this dict, so objects of that database type don't
get mapped out of Java objects into Python values.
It's fairly easy to modify JayDeBeApi to add support for BIGINTs. Edit the
__init__.py file that contains most of the JayDeBeApi code and add the line

'BIGINT': _java_to_py('longValue'),

to the _DEFAULT_CONVERTERS dict.
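The converter mechanism itself is simple enough to sketch. The following self-contained snippet mimics how the dict entry above is used; FakeLong is a hypothetical stand-in for the JPype-wrapped java.lang.Long so the sketch runs without a JVM:

```python
def _java_to_py(method_name):
    # Mirrors JayDeBeApi's helper: build a converter that calls the
    # named method on the boxed Java value.
    def converter(java_value):
        return getattr(java_value, method_name)()
    return converter

# Simplified version of the _DEFAULT_CONVERTERS mapping with the
# BIGINT entry added, as described above.
DEFAULT_CONVERTERS = {'BIGINT': _java_to_py('longValue')}

class FakeLong(object):
    # Hypothetical stand-in for java.lang.Long (runs without a JVM).
    def __init__(self, n):
        self._n = n
    def longValue(self):
        return self._n

print(DEFAULT_CONVERTERS['BIGINT'](FakeLong(44)))  # prints 44
```

If you'd rather not edit the installed source file, the same entry can probably be registered at runtime with jaydebeapi._DEFAULT_CONVERTERS['BIGINT'] = jaydebeapi._java_to_py('longValue'), assuming your JayDeBeApi version exposes those names at module level as the 0.1.x __init__.py does.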