apache-spark emr

Getting & setting spark.driver/executor.extraClassPath on EMR


As far as I can tell, when setting spark.driver.extraClassPath and spark.executor.extraClassPath on AWS EMR (in spark-defaults.conf or as a flag elsewhere), I first have to read the existing value of [...].extraClassPath and then append :/my/additional/classpath to it for my path to take effect.

Is there a function in Spark that lets me just append an additional classpath entry while retaining/respecting the existing paths set by EMR in /etc/spark/conf/spark-defaults.conf?


Solution

  • There is no such "function" in Spark, but on EMR AMIs you can write a bootstrap action that appends/sets whatever you want in spark-defaults.conf; this will of course affect all Spark jobs (see the sketch below).

    When EMR moved to the newer release labels this stopped working, because bootstrap steps were replaced with configuration JSONs and manual bootstraps run before the applications are installed (at least when I tried it).
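
A minimal sketch of such a bootstrap helper, assuming it runs after /etc/spark/conf/spark-defaults.conf exists (i.e. on the older AMI model, or as a script run post-install on a release-label cluster); the path /my/additional/classpath is just the example value from the question:

    #!/usr/bin/env python
    # Hypothetical helper: append an extra entry to
    # spark.driver.extraClassPath / spark.executor.extraClassPath in
    # spark-defaults.conf while preserving whatever EMR already put there.
    CONF = "/etc/spark/conf/spark-defaults.conf"
    EXTRA = "/my/additional/classpath"
    KEYS = ("spark.driver.extraClassPath", "spark.executor.extraClassPath")

    with open(CONF) as f:
        lines = f.readlines()

    out, seen = [], set()
    for line in lines:
        parts = line.split(None, 1)
        if parts and parts[0] in KEYS:
            existing = parts[1].strip() if len(parts) > 1 else ""
            # Only append if the extra path is not already present.
            if EXTRA not in existing.split(":"):
                existing = (existing + ":" + EXTRA).lstrip(":")
            out.append("%s %s\n" % (parts[0], existing))
            seen.add(parts[0])
        else:
            out.append(line)

    # If EMR did not set a key at all, add it with just the extra path.
    for key in KEYS:
        if key not in seen:
            out.append("%s %s\n" % (key, EXTRA))

    with open(CONF, "w") as f:
        f.writelines(out)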