java spring-boot hibernate apache-spark-sql

Hive/SparkSQL Dialect for Hibernate/Spring Boot


Edit: It turns out the error I was getting came from Databricks: I was not specifying the three-part namespace it expects (catalog_name.schema_name.table_name). After I did this and fixed the connection, creating a custom dialect as described in the answer below seems to be the next step to get the returned data to play nicely with Hibernate and Spring.

I have a Spring Boot web app that currently connects to a MySQL database. I would like to change this connection to a Spark SQL one using Databricks' JDBC driver.
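
For reference, here is roughly how the datasource side is wired up. The host, HTTP path, and token values are placeholders, and ConnCatalog/ConnSchema are the Databricks JDBC driver parameters that, as far as I can tell, pin the default catalog and schema mentioned in the edit above:

spring:
  datasource:
    driver-class-name: com.databricks.client.jdbc.Driver
    url: jdbc:databricks://<workspace-host>:443;transportMode=http;ssl=1;httpPath=<http-path>;AuthMech=3;UID=token;PWD=<personal-access-token>;ConnCatalog=catalog_name;ConnSchema=schema_name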

After changing the connection details, my app is able to fully start and connect (or at least initialize the connection pool), but a few Hibernate calls that happen whenever someone logs in are all failing. The only reason they seem to fail is the single backticks being inserted into the query automatically. Spark SQL rejects these and gives me a not-found exception:

Caused by: org.apache.spark.sql.catalyst.ExtendedAnalysisException: [TABLE_OR_VIEW_NOT_FOUND] The table or view `TABLE_NAME` cannot be found. Verify the spelling and correctness of the schema and catalog.

I found this question from 2017 stating that there is no SparkSQL dialect for Hibernate, and the docs confirm this. I tried other dialects, including Apache Derby's, but none seemed to work.

Here are the hibernate jpa properties I'm currently using:

jpa:
  show-sql: true
  hibernate:
    ddlAuto: none
    naming:
      implicit-strategy: org.hibernate.boot.model.naming.ImplicitNamingStrategyLegacyHbmImpl
      physical-strategy: org.springframework.boot.orm.jpa.hibernate.SpringPhysicalNamingStrategy
  properties:
    hibernate:
      dialect: org.hibernate.dialect.MySQLDialect
      default_schema: schema
  database: default

Can anyone offer any guidance on this issue? I am torn between concluding this is not currently possible and ripping out all of my Hibernate queries in favor of plain JDBC ones. But to me, it seems the only thing I need to figure out is how to stop those backticks from being generated in the query.


Solution

  • You can try to override this behaviour in a custom Hibernate dialect:

    import org.hibernate.dialect.MySQLDialect;

    /**
     * Keeps MySQLDialect's SQL generation but replaces its backtick
     * identifier quoting with double quotes.
     */
    public class MySparkSQLDialect extends MySQLDialect {

        // ...

        // Hibernate calls these two methods whenever it quotes an
        // identifier (table/column name) in generated SQL.
        @Override
        public char openQuote() {
            return '"';
        }

        @Override
        public char closeQuote() {
            return '"';
        }
    }
    
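    Once the class is on the classpath, point Hibernate at it instead of MySQLDialect in the JPA properties from the question (com.example is a placeholder package):

      properties:
        hibernate:
          dialect: com.example.MySparkSQLDialect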

    P.S. To be honest, I am not familiar with Spark SQL, so instead of " you may need to use some other symbol.