amazon-s3 jdbc aws-glue ojdbc jaydebeapi

jaydebeapi cannot find jar on AWS Glue


I'm trying to connect to an Oracle database via jaydebeapi in Python on AWS Glue. I'm getting the error that says:

TypeError: Class oracle.jdbc.driver.OracleDriver is not found

I believe the error occurs because jaydebeapi cannot find the ojdbc jar: the same code worked on my machine when the jar path was a local path.

What should I do on AWS Glue so that jaydebeapi recognizes the S3 path to the jar? I've tried both entering the path in the Dependent JARs path field and specifying --extra-jars in the Job parameters field.

Here is my code:

import jaydebeapi
props = {
   "user": "user",
   "password": "password",
   "oracle.jdbc.timezoneAsRegion": "false"
}
conn = jaydebeapi.connect("oracle.jdbc.driver.OracleDriver",
  "jdbc:oracle:thin:/oracle@host:port/orcl", 
  props, 
  "s3://path/to/ojdbc8-21.4.0.0.1.jar", 
  libs=None)
with conn.cursor() as curs:
  curs.execute("CREATE SEQUENCE SCHEMA.TABLE")

Thank you in advance for sharing your insights!


Solution

  • It turns out that jaydebeapi cannot load a jar from an S3 path, because that path is not local to the environment. I had to download the jar from S3 to the /tmp/ directory of the AWS Glue job first, as follows:

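    import boto3
    # jaydebeapi needs the jar on the local filesystem, so download it to an explicit /tmp/ path.
    s3 = boto3.resource('s3')
    s3.Bucket('bucket_name').download_file(
        'external-lib/jar/ojdbc8-21.4.0.0.1.jar',  # S3 object key
        '/tmp/ojdbc8-21.4.0.0.1.jar')              # local destination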
    

    Then I could load the jar from /tmp/:

    import jaydebeapi
    props = {
       "user": "user",
       "password": "password",
       "oracle.jdbc.timezoneAsRegion": "false"
    }
    conn = jaydebeapi.connect("oracle.jdbc.driver.OracleDriver",
      "jdbc:oracle:thin:/oracle@host:port/orcl", 
      props, 
      "/tmp/ojdbc8-21.4.0.0.1.jar", 
      libs=None)
    with conn.cursor() as curs:
      curs.execute("CREATE SEQUENCE SCHEMA.TABLE")
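
The two steps above can be folded into one small helper that takes the s3:// path, downloads the jar to a local directory, and returns the local path to hand to jaydebeapi. This is only a sketch: `split_s3_uri` and `fetch_jar` are hypothetical names, not part of boto3 or jaydebeapi, and the `s3_client` parameter is an assumption added so the helper can be exercised without AWS access. The only boto3 call used is the client's `download_file(Bucket, Key, Filename)`.

```python
import os
import tempfile
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/key' into (bucket, key). Hypothetical helper."""
    parsed = urlparse(uri)
    key = parsed.path.lstrip("/")
    if parsed.scheme != "s3" or not parsed.netloc or not key:
        raise ValueError(f"not a valid s3:// URI: {uri!r}")
    return parsed.netloc, key


def fetch_jar(uri, dest_dir=None, s3_client=None):
    """Download the jar at `uri` to local disk and return the local path.

    dest_dir defaults to the system temp directory (assumed to be /tmp on Glue).
    The download is skipped if the file is already present.
    """
    bucket, key = split_s3_uri(uri)
    dest_dir = dest_dir or tempfile.gettempdir()
    local_path = os.path.join(dest_dir, os.path.basename(key))
    if not os.path.exists(local_path):
        if s3_client is None:
            # Imported lazily so the helper can be tested without boto3 installed.
            import boto3
            s3_client = boto3.client("s3")
        s3_client.download_file(bucket, key, local_path)
    return local_path
```

The returned path can then be passed as the jar argument of `jaydebeapi.connect` in place of the hard-coded "/tmp/ojdbc8-21.4.0.0.1.jar", for example `fetch_jar("s3://bucket_name/external-lib/jar/ojdbc8-21.4.0.0.1.jar")`.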