Tags: databricks, azure-databricks, databricks-sql

Databricks 'Socket Closed' when trying to view Sample Data from Hive_MetaStore


When attempting to view Sample Data from Hive_MetaStore, I keep getting the error "Socket Closed".


Can someone let me know what could be the cause of this problem?


Solution

  • Error message:

    [DataDirect][ODBC Progress OpenEdge Wire Protocol driver] Socket closed
    [DataDirect][ODBC DB2 Wire Protocol driver] Socket closed

    The "Socket Closed" error usually indicates a network connectivity problem between the client and server, possibly due to issues like firewall settings, network configurations, or interruptions.
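    Before digging into driver settings, a quick way to rule out a basic reachability problem is a raw TCP check against the endpoint. This is a minimal sketch; the hostname and port in the commented example are placeholders you would replace with your own workspace endpoint:

```python
import socket

def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (incl. timeouts) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Placeholder endpoint -- substitute your workspace hostname and port
# (443 is typical for Databricks SQL endpoints):
# print(can_connect("adb-1234567890123456.7.azuredatabricks.net", 443))
```

    If this returns False from the client machine, the problem is network-level (firewall, proxy, routing) rather than the ODBC driver itself.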

    The following can help narrow down the cause:

    An ODBC trace is needed to see on which ODBC call the 'Socket closed' error is returned. The relevant trace settings are:

    Trace=1 (starts the trace)
    TraceOptions=3 (adds thread identification and timestamp information)
    ODBCTraceFlush=1 (writes all content to the trace file immediately)

    For more detail, see DataDirect's guide on creating an ODBC trace log on Windows platforms (embedded video in the original answer).
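    As a sketch, these trace keys typically go in the [ODBC] section of the driver's odbc.ini (or the equivalent tracing tab of the Windows ODBC Data Source Administrator). The TraceFile path below is an assumption for illustration only:

```ini
[ODBC]
Trace=1
; Assumed log path -- point this anywhere writable
TraceFile=C:\temp\odbctrace.log
TraceOptions=3
ODBCTraceFlush=1
```

    Remember to turn tracing off (Trace=0) once the log is captured, since it slows every ODBC call.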

    You can also increase the metastore connection timeout by setting the appropriate configuration property when building the Spark session:

    from pyspark.sql import SparkSession

    # Increase the Hive metastore client socket timeout to 600 seconds (10 minutes)
    spark = SparkSession.builder \
        .appName("Increase Connection Timeout") \
        .config("spark.hadoop.hive.metastore.client.socket.timeout", "600") \
        .getOrCreate()
    

    Setting the hive.metastore.client.socket.timeout property to 600 (seconds) increases the metastore connection timeout to 10 minutes.

    Reference: DataDirect, "What does a 'socket closed' error mean with an ODBC driver?"