hadoop, apache-spark, apache-spark-sql, metastore

Unable to connect to Spark SQL


I am using a remote MySQL metastore for Hive. When I run the Hive client it works perfectly, but when I try to use spark-sql, either via spark-shell or spark-submit, I am not able to connect to Hive and get the following error:

    Caused by: javax.jdo.JDOFatalInternalException: Error creating transactional connection factory
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.derby.jdbc.EmbeddedDriver

I do not understand why Spark tries to connect to a Derby database when I am using a MySQL database for the metastore.

I am using Apache Spark version 1.3 and Cloudera CDH 5.4.8.


Solution

  • It seems Spark is picking up the default Hive settings, which use an embedded Derby metastore. Follow these steps:

    Check whether your hive-site.xml has the location of the MySQL metastore. If not, add the following properties and restart spark-shell:

    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://MYSQL_HOST:3306/hive_{version}</value>
        <description>JDBC connect string for a JDBC metastore</description>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>XXXXXXXX</value>
    <description>Username to use against metastore database</description>
    </property> 
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>XXXXXXXX</value>
    <description>Password to use against metastore database</description>
    </property>
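
    Spark only reads these settings if hive-site.xml is on its classpath, and the `NoClassDefFoundError` suggests the MySQL JDBC driver is not visible either. A minimal sketch of the setup (the paths below are typical CDH locations and are assumptions; adjust them for your install):

    ```shell
    # Copy Hive's config so Spark picks up the MySQL metastore settings
    # instead of falling back to the embedded Derby metastore
    # (paths are assumed; adjust for your installation)
    cp /etc/hive/conf/hive-site.xml $SPARK_HOME/conf/

    # Start spark-shell with the MySQL JDBC driver on the classpath of
    # both the driver and the executors
    spark-shell \
      --driver-class-path /usr/share/java/mysql-connector-java.jar \
      --jars /usr/share/java/mysql-connector-java.jar
    ```

    Once the driver is on the classpath and the connection URL points at MySQL, Spark should no longer attempt to initialize `org.apache.derby.jdbc.EmbeddedDriver`; you can sanity-check the connection by running a simple query such as `sqlContext.sql("show tables")` in spark-shell.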