apache-spark apache-kafka spark-streaming-kafka

import org.apache.spark.streaming.kafka._: Cannot resolve symbol kafka


I have created a Spark application that integrates with Kafka and reads a stream of data from it.

But when I try to import org.apache.spark.streaming.kafka._, I get the error "Cannot resolve symbol kafka". What should I do to import this library?


Solution

  • Depending on your Spark and Scala versions, you need to add the matching Spark-Kafka integration library to your dependencies.

    Spark Structured Streaming

    If you plan to use Spark Structured Streaming, add the following to your dependencies, as described here:

    For Scala/Java applications using SBT/Maven project definitions, link your application with the following artifact:

    groupId = org.apache.spark
    artifactId = spark-sql-kafka-0-10_2.12
    version = 3.0.1
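
    In an SBT build, the coordinates above could be declared roughly as follows. This is a minimal sketch, not a definitive build file: the Scala patch version and the "provided" spark-sql dependency are assumptions and should be matched to your own Spark installation.

    ```scala
    // build.sbt (sketch; adjust versions to your Spark installation)
    scalaVersion := "2.12.10" // assumed; any 2.12.x release matches the _2.12 artifact suffix

    libraryDependencies ++= Seq(
      // Core Spark SQL API; "provided" assumes the cluster supplies Spark at runtime
      "org.apache.spark" %% "spark-sql" % "3.0.1" % "provided",
      // Kafka source/sink for Structured Streaming; %% appends the _2.12 suffix
      "org.apache.spark" %% "spark-sql-kafka-0-10" % "3.0.1"
    )
    ```

    If you run the application directly from the IDE or with sbt run rather than spark-submit, you may need to drop the "provided" scope so that Spark itself is on the classpath.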
    

    Note that to use the headers functionality, your Kafka client version must be 0.11.0.0 or later. For Python applications, you need to add the above library and its dependencies when deploying your application. When experimenting in spark-shell, you likewise need to add the above library and its dependencies when invoking spark-shell. In both cases, see the Deploying subsection of the linked guide.

    Spark Streaming

    If you plan to work with Spark Streaming (Direct API), you can follow the guidance given here:

    For Scala/Java applications using SBT/Maven project definitions, link your streaming application with the following artifact (see Linking section in the main programming guide for further information).

    groupId = org.apache.spark
    artifactId = spark-streaming-kafka-0-10_2.12
    version = 3.0.1
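
    For an SBT project, the Spark Streaming coordinates above might be declared as in the sketch below. The Scala patch version and the "provided" spark-streaming dependency are assumptions added for illustration; match them to your own environment.

    ```scala
    // build.sbt (sketch; adjust versions to your Spark installation)
    scalaVersion := "2.12.10" // assumed; any 2.12.x release matches the _2.12 artifact suffix

    libraryDependencies ++= Seq(
      // Core DStream API; "provided" assumes the cluster supplies Spark at runtime
      "org.apache.spark" %% "spark-streaming" % "3.0.1" % "provided",
      // Kafka 0.10+ direct stream integration; %% appends the _2.12 suffix
      "org.apache.spark" %% "spark-streaming-kafka-0-10" % "3.0.1"
    )
    ```

    Note that this artifact places its classes in the package org.apache.spark.streaming.kafka010, so the import becomes import org.apache.spark.streaming.kafka010._ rather than import org.apache.spark.streaming.kafka._. The older package name belongs to the Kafka 0.8 integration, which is not published for Spark 3.x.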