apache-sparkspark-streamingopentsdb

Sending Spark streaming metrics to open tsdb


How can I send metrics from my spark streaming job to open tsdb database? I am trying to use open tsdb as data source in Grafana. Can you please help me with some references where I can start.

I do see open tsdb reporter here which does similar job. How can I integrate the metrics from Spark streaming job to use this? Is there any easy options to do it.


Solution

  • One way to send the metrics to opentsdb is to use it's REST API. To use it, simply convert the metrics to JSON strings and then utilize the Apache Http Client library to send the data (it's in java and can therefore be used in scala). Example code can be found on github.


    A more elegant solution would be to use the Spark metrics library and add a sink to the database. There has been a discussion on adding an OpenTSDB sink for the Spark metrics library, however, finally it was not added into Spark itself. The code is avaiable on github and should be possible to use. Unfortunalty the code is compatible on Spark 1.4.1, however, in worst case it should still be possible to get some indications of what is necessary to add.