I'm trying to start spark-shell with the packages option set by default, for example through an environment variable.
The normal command is
spark-shell --packages com.databricks:spark-csv_2.11:1.3.0
I would like to avoid typing --packages com.databricks:spark-csv_2.11:1.3.0 every time and set a variable instead.
Which variable can I set to do that?
You can add the line
spark.jars.packages com.databricks:spark-csv_2.11:1.3.0
to your Spark configuration file:
$SPARK_HOME/conf/spark-defaults.conf
Note: this will affect every Spark application, not only spark-shell.
See the Spark documentation for more details.
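For example, assuming a standard installation where $SPARK_HOME points to your Spark directory, you could append the setting from a shell like this (a minimal sketch, not the only way to edit the file):
# Add the package coordinate to spark-defaults.conf
echo "spark.jars.packages com.databricks:spark-csv_2.11:1.3.0" >> $SPARK_HOME/conf/spark-defaults.conf
# From now on, spark-shell picks up the package without --packages
spark-shell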