Tags: scala, apache-spark, hive, spark-thriftserver

Spark 2.x using HiveThriftServer2 with sqlContext


My requirement is to enable ODBC/JDBC access to SparkSQL temporary tables, backed by DataFrames in Spark (a mix of JSON-based and streaming sources).

I had this working in Spark 1.6, and then recently upgraded to Spark 2.1.1. I adjusted my code following the second answer to this question. However, I noticed the deprecation warning on this line:

val sqlContext = new org.apache.spark.sql.SQLContext(spark.sparkContext)

So I checked the Javadoc on SQLContext, and it says "Deprecated. Use SparkSession.builder instead. Since 2.0.0." But then, according to even the latest HiveThriftServer2.scala code in git, the method startWithContext requires a parameter of type SQLContext.

So, could anyone in the know shed some light on this:

  1. Have I picked the right way to solve the problem in the first place? I'd prefer not to start HiveThriftServer2 from within my Spark code, but /sbin/start-thriftserver.sh doesn't seem to provide an option to start a thriftserver instance with my class. Or does it, and I'm just missing it?

  2. Is there another way to start HiveThriftServer2 from the Spark code, using SparkSession?


Solution

  • You don't have to create an SQLContext any more; just take it from the SparkSession.

    import org.apache.spark.sql.{SQLContext, SparkSession}
    import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2
    
    val spark: SparkSession = SparkSession
      .builder()
      .appName("Your application name")
      .getOrCreate()
    
    val sqlContext: SQLContext = spark.sqlContext
    HiveThriftServer2.startWithContext(sqlContext)
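To actually expose a temporary table over JDBC/ODBC (the original requirement), register the DataFrame as a temp view on the same session before starting the server. A minimal sketch, assuming a hypothetical JSON file and view name `people`:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2

val spark = SparkSession
  .builder()
  .appName("thriftserver-demo")
  .getOrCreate()

// Hypothetical JSON source; any DataFrame works the same way,
// including one built from a streaming pipeline
val df = spark.read.json("/path/to/people.json")

// A temp view is scoped to its session, so the server must be
// started with this session's SQLContext for clients to see it
df.createOrReplaceTempView("people")
HiveThriftServer2.startWithContext(spark.sqlContext)
```

Once the server is up, a JDBC client such as beeline should be able to connect (by default on port 10000, e.g. `beeline -u jdbc:hive2://localhost:10000`) and run `SELECT * FROM people`.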