pyspark, azure-databricks, azure-data-explorer, kusto-explorer

java.io.UncheckedIOException: io.netty.channel.StacklessClosedChannelException while writing to adx


I'm trying to write some data to an Azure Data Explorer (ADX) table, but I get the exception below. The ADX cluster is open to all networks, and connectivity between Databricks and ADX looks fine: nslookup resolves the cluster and the endpoint connection status check succeeds.

Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 3.0 failed 4 times, most recent failure: Lost task 0.3 in stage 3.0 (TID 12) (10.139.64.7 executor 0): java.io.UncheckedIOException: io.netty.channel.StacklessClosedChannelException

Sample Code:

from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .appName("ADXWriter") \
    .getOrCreate()

# Connection settings for the Kusto Spark connector (values omitted here)
kustoOptions = {
    "kustoCluster": "https://",
    "kustoDatabase": "",
    "kustoTable": "",
    "kustoAadAppId": "",
    "kustoAadAppSecret": "",
    "kustoAadAuthorityID": "",
}

# df is an existing DataFrame; append its rows to the ADX table
df.write \
    .format("com.microsoft.kusto.spark.datasource") \
    .option("kustoCluster", kustoOptions["kustoCluster"]) \
    .option("kustoDatabase", kustoOptions["kustoDatabase"]) \
    .option("kustoTable", kustoOptions["kustoTable"]) \
    .option("kustoAadAppId", kustoOptions["kustoAadAppId"]) \
    .option("kustoAadAppSecret", kustoOptions["kustoAadAppSecret"]) \
    .option("kustoAadAuthorityID", kustoOptions["kustoAadAuthorityID"]) \
    .mode("Append") \
    .save()

Databricks runtime: 14.1 (includes Apache Spark 3.5.0, Scala 2.12)

Libraries:

com.azure:azure-identity:1.13.0

com.microsoft.azure.kusto:kusto-data:5.1.0

com.microsoft.azure.kusto:kusto-ingest:5.1.0

com.microsoft.azure.kusto:kusto-spark_3.0_2.12:5.1.0

Am I missing anything? Any insight on how to fix this?


Solution

  • This is caused by classpath conflicts between io.netty dependencies. There are two things you can try:

    a) com.microsoft.azure.kusto:kusto-spark_3.0_2.12:5.1.0 already resolves the kusto-ingest and kusto-data dependencies on its own. You can exclude the separately added kusto-data and kusto-ingest libraries and see if it works.

    b) Use the uber jar from the release linked here. It has relocated and shaded classes for netty, so it will work even when other dependencies bring in conflicting classes. It will be published to Maven as well in upcoming releases.

    https://github.com/Azure/azure-kusto-spark/releases/tag/v3.0_5.2.2-SNAPSHOT
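
To check whether more than one jar really ships the same io.netty classes, a small scan of the jar files can help. The following is only a minimal sketch, not part of the connector: the function names find_duplicate_classes and jars_with_conflicts are hypothetical, and it assumes you can list the jar files that end up on the cluster classpath (the directory they live in depends on your environment). It only compares class names inside the jars and does not inspect what the JVM actually loaded.

```python
import zipfile
from collections import defaultdict
from pathlib import Path


def find_duplicate_classes(jar_paths, package_prefix="io/netty/"):
    """Map each class under package_prefix to the jars that contain it,
    keeping only classes that appear in more than one jar."""
    owners = defaultdict(set)
    for jar in jar_paths:
        with zipfile.ZipFile(jar) as zf:
            for name in zf.namelist():
                # Jar entries use '/' separators, e.g. io/netty/channel/Channel.class
                if name.startswith(package_prefix) and name.endswith(".class"):
                    owners[name].add(Path(jar).name)
    return {cls: sorted(jars) for cls, jars in owners.items() if len(jars) > 1}


def jars_with_conflicts(jar_paths, package_prefix="io/netty/"):
    """Return the sorted names of jars that share at least one class."""
    duplicates = find_duplicate_classes(jar_paths, package_prefix)
    return sorted({jar for jars in duplicates.values() for jar in jars})
```

Calling jars_with_conflicts with the list of jar paths returns the jars that overlap on io/netty classes. A shaded uber jar should not show up in the result, since its netty classes are relocated to a different package and no longer start with io/netty/.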