Tags: apache-spark, spark-shell

Set default packages variable for spark-shell


I am trying to start spark-shell with the packages variable already set by default, for example via an environment variable.

The normal execution command is:

    spark-shell --packages com.databricks:spark-csv_2.11:1.3.0

I would like to avoid typing --packages com.databricks:spark-csv_2.11:1.3.0 every time by setting a variable instead.

Which variable can I set in order to do that?


Solution

  • You can add the line

    spark.jars.packages  com.databricks:spark-csv_2.11:1.3.0
    

    into your spark configuration file:

    $SPARK_HOME/conf/spark-defaults.conf

    Note: this will affect every Spark application, not only spark-shell.
    See the Spark configuration documentation for more details.
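
    To check that the setting works, launch spark-shell without the flag and
    read a CSV file through the package. A minimal sketch, assuming Spark 1.x
    (matching spark-csv 1.3.0) and a hypothetical local file data.csv with a
    header row:

        $ spark-shell   # no --packages flag needed anymore

        scala> // the spark-csv data source should now resolve automatically
        scala> val df = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").load("data.csv")
        scala> df.printSchema()

    Equivalently, for a single session you can pass the same property on the
    command line with --conf spark.jars.packages=com.databricks:spark-csv_2.11:1.3.0
    instead of editing spark-defaults.conf.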