I have read that Spark does not ship with Prometheus as one of its pre-packaged sinks, so I found this post on how to monitor Apache Spark with Prometheus.
But I found it difficult to understand and to get working, because I am a beginner and this is my first time working with Apache Spark.
The first thing I do not get is what I actually need to do.
Do I need to change metrics.properties?
Should I add some code to the app?
I do not understand what the steps are to make it work.
What I am doing so far is changing the properties as in the link and passing this option:
--conf spark.metrics.conf=<path_to_the_file>/metrics.properties
What else do I need to do to see metrics from Apache Spark?
I also found this link: Monitoring Apache Spark with Prometheus
https://argus-sec.com/monitoring-spark-prometheus/
But I could not get it working with that either.
I have read that there is a way to get metrics from Graphite and then export them to Prometheus, but I could not find any useful documentation.
There are a few ways to monitor Apache Spark with Prometheus.
One way is JmxSink + jmx-exporter.
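For the JmxSink half of this approach, Spark's metrics configuration has to enable the JMX sink. A minimal sketch, assuming you keep the file in the current directory under the name metrics.properties (the file name and location are only an illustration, use whatever path you pass to Spark):

```shell
# Write a minimal metrics.properties that enables Spark's built-in JMX sink.
# The leading "*" applies the sink to all instances (master, worker, driver, executor).
# Assumption: the file lives in the current directory; adjust the path as needed.
cat > metrics.properties <<'EOF'
*.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink
EOF
```

The file can then be passed to Spark with `--conf spark.metrics.conf=<path_to_the_file>/metrics.properties`, as in the question.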
The following command assumes that the jmx_prometheus_javaagent-0.3.1.jar file and the spark.yml file were downloaded in previous steps. The paths might need to be changed accordingly.
bin/spark-shell --conf "spark.driver.extraJavaOptions=-javaagent:jmx_prometheus_javaagent-0.3.1.jar=8080:spark.yml"
After running it, the metrics are available at localhost:8080/metrics
Prometheus can then be configured to scrape the metrics from jmx-exporter.
NOTE: We have to handle the discovery part properly if it's running in a cluster environment.
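For the single-machine case, the scrape step could look roughly like the sketch below. It assumes the agent listens on localhost:8080 as in the spark-shell command above; the job name `spark-driver` and the 15s interval are made up for illustration, and a cluster would need real service discovery instead of a static target.

```shell
# Write a minimal prometheus.yml with a single static scrape target.
# Assumptions: "spark-driver" is a hypothetical job name, and localhost:8080
# is the port given to the jmx-exporter java agent.
# Prometheus scrapes the /metrics path by default, so no metrics_path is set.
cat > prometheus.yml <<'EOF'
scrape_configs:
  - job_name: 'spark-driver'
    scrape_interval: 15s
    static_configs:
      - targets: ['localhost:8080']
EOF
```

Prometheus can then be started with `prometheus --config.file=prometheus.yml`, and the Spark metrics should show up under that job once the target is reported as up.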