Tags: python, jdbc, apache-spark, jpype, jaydebeapi

JPype and JayDeBeAPI return jpype._jclass.java.lang.Long


I'm using JayDeBeAPI in PySpark (the Apache Spark Python API), and here's the beginning of my code (note that I'm actually running all of this through the interactive PySpark shell).

import jaydebeapi
import jpype

conn = jaydebeapi.connect('org.apache.phoenix.jdbc.PhoenixDriver',
                          ['jdbc:phoenix:hostname', '', ''])

I am querying Apache Phoenix, which is an SQL "front-end" for Apache HBase.

Here's my Python code for the SQL query:

curs = conn.cursor()
curs.execute('select "username",count("username") from "random_data" GROUP BY "username"')
curs.fetchall()

The output I'm getting looks like this for every row:

(u'Username', <jpype._jclass.java.lang.Long object at 0x25d1e10>)

How can I fix it so that it actually shows the value of that returned column (the count column)?

According to the Apache Phoenix datatypes page, the datatype of the count column is BIGINT, which maps to java.lang.Long, but for some reason JPype is not displaying the result.

I installed JayDeBeAPI 0.1.4 and JPype 0.5.4.2 with python setup.py install after downloading them.


Solution

  • The object returned by JPype is a Python wrapper around Java's java.lang.Long class. To get the value out of it, use the value attribute (a sketch that unwraps a whole result set this way follows these steps):

    >>> n = jpype.java.lang.Long(44)   # the JVM is already running once jaydebeapi.connect() has been called
    >>> n
    <jpype._jclass.java.lang.Long object at 0x2377390>
    >>> n.value
    44L
    

    JayDeBeApi contains a dict (_DEFAULT_CONVERTERS) that maps types it recognises to functions that convert the Java values to Python values. This dict can be found at the bottom of __init__.py in the JayDeBeApi source code. BIGINT is not included in this dict, so values of that database type are left as raw Java objects instead of being converted to Python values. (A runtime workaround that avoids editing the installed package is sketched after these steps.)

    It's fairly easy to modify JayDeBeApi to add support for BIGINTs. Edit the __init__.py file that contains most of the JayDeBeApi code and add the line

        'BIGINT': _java_to_py('longValue'),
    

    to the _DEFAULT_CONVERTERS dict.
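
    If you would rather not touch JayDeBeApi at all, you can also unwrap the Long objects by hand with the value attribute shown above. A minimal sketch, assuming each row has the two-column (username, count) shape from the query in the question:

        # Unwrap java.lang.Long objects manually after fetching.
        # Assumes each row is (username, count) as in the question's query.
        rows = curs.fetchall()
        results = [(username, count.value) for username, count in rows]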
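
    Alternatively, instead of editing the installed __init__.py, you can patch the converter dict at runtime from your own code. A minimal sketch, assuming JayDeBeApi 0.1.4, where both _DEFAULT_CONVERTERS and _java_to_py are module-level names in the jaydebeapi package:

        import jaydebeapi

        # Register a converter for BIGINT so java.lang.Long results are
        # unwrapped via their longValue() method, mirroring the edit above.
        jaydebeapi._DEFAULT_CONVERTERS.update({
            'BIGINT': jaydebeapi._java_to_py('longValue'),
        })

    Run this before executing the query so the converter is already in place when the rows are fetched.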