apache-spark cassandra spark-cassandra-connector spark-shell

Spark-shell does not import specified jar file


I am a complete beginner to all of this, so pardon me if I'm missing some totally obvious step. I installed Spark 3.1.2 and Cassandra 3.11.11, and I'm trying to connect them by following this guide I found, for which I built a fat jar. In the guide I linked, when they run the spark-shell command with the jar file, the following line appears at startup:

INFO SparkContext: Added JAR file:/home/chbatey/dev/tmp/spark-cassandra-connector/spark-cassandra-connector-java/target/scala-2.10/spark-cassandra-connector-java-assembly-1.2.0-SNAPSHOT.jar at http://192.168.0.34:51235/jars/spark-
15/01/26 16:16:10 INFO SparkILoop: Created spark context..

I followed all of the steps, but no such line appears in my shell. To confirm that the jar hasn't been added, I tried the sample program from that guide, and it throws an error:

java.lang.NoClassDefFoundError: com/datastax/spark/connector/util/Logging

What should I do? I'm using spark-cassandra-connector-3.1.0.


Solution

  • You don't need to compile the connector yourself. Follow the official documentation and use --packages, which automatically downloads the connector together with all of its dependencies:

    spark-shell --packages com.datastax.spark:spark-cassandra-connector_2.12:3.1.0
    

    Your error occurs because the connector jar on its own doesn't contain its dependencies, so with --jars you would have to list every one of them yourself (the Java driver, etc.). If you still want to use the --jars option, download the assembly version of the connector (link to jar), which contains all the necessary dependencies.
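
For the --jars route, here is a rough sketch of how you might check a jar and build the launch command. The helper names, the jar path, and the Cassandra host are assumptions for illustration only; the jar file name is what I would expect the 3.1.0 assembly download for Scala 2.12 to be called, so adjust it to whatever you actually saved. It also assumes `unzip` is installed.

```shell
# Return success if the given jar contains the given class.
# The class is written with slashes, as in the NoClassDefFoundError message.
jar_has_class() {
  unzip -l "$1" | grep -qF "$2.class"
}

# Print a spark-shell command line that loads a local jar.
# $1 = path to the jar, $2 = Cassandra host (defaults to 127.0.0.1).
connector_shell_cmd() {
  printf 'spark-shell --jars %s --conf spark.cassandra.connection.host=%s\n' \
    "$1" "${2:-127.0.0.1}"
}

# Hypothetical download location; change this to your own path.
connector_shell_cmd "$HOME/jars/spark-cassandra-connector-assembly_2.12-3.1.0.jar"

# Example check, to be run once the jar exists at that path:
#   jar_has_class "$HOME/jars/spark-cassandra-connector-assembly_2.12-3.1.0.jar" \
#     com/datastax/spark/connector/util/Logging && echo "class found"
```

If `jar_has_class` fails for the jar you are passing to --jars, that jar does not bundle the class from your stack trace, and you need either the assembly jar or the --packages option shown above.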