When running spark-shell, it creates a file derby.log and a folder metastore_db. How do I configure Spark to put these somewhere else?
For the Derby log, I've tried to get rid of derby.log like so:

spark-shell --driver-memory 10g --conf "-spark.driver.extraJavaOptions=Dderby.stream.info.file=/dev/null"

with a couple of different properties, but Spark ignores them.

Does anyone know how to get rid of these, or how to specify a default directory for them?
The use of hive.metastore.warehouse.dir has been deprecated since Spark 2.0.0; see the docs.
As hinted by this answer, the real culprit for both the metastore_db directory and the derby.log file being created in every working subdirectory is the derby.system.home property defaulting to . (the current working directory).
Thus, a default location for both can be specified by adding the following line to spark-defaults.conf:
spark.driver.extraJavaOptions -Dderby.system.home=/tmp/derby
where /tmp/derby can be replaced by the directory of your choice.
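If editing spark-defaults.conf is not convenient, the same Java option can be passed when launching the shell; a minimal sketch, again using /tmp/derby only as a placeholder directory:

spark-shell --conf "spark.driver.extraJavaOptions=-Dderby.system.home=/tmp/derby"

Note that the -D goes inside the quoted property value. If the goal is only to redirect or silence the derby.log file itself, Derby's derby.stream.error.file system property can be set the same way.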