Tags: apache-spark, solr, lucidworks

Enable Spark metrics in LucidWorks Fusion


I'm attempting to add the Graphite sink to my Fusion Spark component.

I've created a file, ${FUSION_DIR}/apps/spark-dist/conf/metrics.properties, with the following contents:

# Enable Graphite
*.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
*.sink.graphite.host=graphite-server
*.sink.graphite.port=2003
*.sink.graphite.period=10
*.sink.graphite.prefix=lab.$(hostname)

# Enable the JVM source for the master, worker, driver and executor instances
master.source.jvm.class=org.apache.spark.metrics.source.JvmSource
worker.source.jvm.class=org.apache.spark.metrics.source.JvmSource
driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource
executor.source.jvm.class=org.apache.spark.metrics.source.JvmSource
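
As a sanity check while debugging, the same file can temporarily route metrics to Spark's built-in ConsoleSink, which at least confirms the file is being picked up at all. A minimal sketch (debug-only, not part of my target setup):

# Debug only: dump all registered metrics to stdout every 10 seconds
*.sink.console.class=org.apache.spark.metrics.sink.ConsoleSink
*.sink.console.period=10
*.sink.console.unit=seconds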

And I added the following to ${FUSION_DIR}/apps/spark-dist/bin/spark-submit:

exec "${SPARK_HOME}"/bin/spark-class org.apache.spark.deploy.SparkSubmit --files="${SPARK_HOME}"/conf/metrics.properties --conf spark.metrics.conf=metrics.properties  "$@"

But I see no metrics reported in Graphite, and no errors in the Spark logs.
Has anyone used the Spark metrics configuration in Fusion successfully?
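
To rule out basic connectivity problems, Graphite's plaintext protocol can also be exercised by hand; this sketch assumes the same host and port as the config above, with a throwaway metric name of my choosing:

echo "lab.test.metric 1 $(date +%s)" | nc graphite-server 2003

If that datapoint shows up in Graphite, the sink configuration rather than the network is the likely culprit.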


Solution

  • I needed to use the full path in the --conf parameter passed to spark-submit:

    exec "${SPARK_HOME}"/bin/spark-class org.apache.spark.deploy.SparkSubmit --files="${SPARK_HOME}"/conf/metrics.properties --conf spark.metrics.conf="${SPARK_HOME}"/conf/metrics.properties "$@" .

    I hadn't seen an error when the Spark master and worker processes started, but I did see an error when starting spark-shell, which clued me in to the configuration issue.
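
    Once the path was fixed, one way to confirm metrics are actually flowing is to query Graphite's render API for one of the JVM metrics; the target path below is illustrative and depends on the configured prefix and which Spark instance is reporting:

    curl "http://graphite-server/render?target=lab.*.jvm.heap.used&format=json"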