I needed to get the throughput of Heron Cluster for some reasons, but there is no metric in the Heron UI. So do you have any ideas about how to monitor the throughput of Heron Cluster? Thanks.
The result of running heron-explorer as follows:
yitian@heron01:~$ heron-explorer metrics aurora/yitian/devel SentenceWordCountTopology
[2018-08-03 21:02:09 +0000] [INFO]: Using tracker URL: http://127.0.0.1:8888
'spout' metrics:
container id jvm-uptime-secs jvm-process-cpu-load jvm-memory-used-mb emit-count ack-count fail-count
------------------- ----------------- ---------------------- -------------------- ------------ ----------- ------------
container_3_spout_6 2053 0.253257 146 1.13288e+07 1.13278e+07 0
container_4_spout_7 2091 0.150625 137.5 1.1624e+07 1.16228e+07 231
'count' metrics:
container id jvm-uptime-secs jvm-process-cpu-load jvm-memory-used-mb emit-count execute-count ack-count fail-count
-------------------- ----------------- ---------------------- -------------------- ------------ --------------- ----------- ------------
container_6_count_12 2092 0.184742 155.167 0 4.6026e+07 4.6026e+07 0
container_5_count_9 2091 0.387867 146 0 4.60069e+07 4.60069e+07 0
container_6_count_11 2092 0.184488 157.833 0 4.58158e+07 4.58158e+07 0
container_4_count_8 2091 0.443688 129.833 0 4.58722e+07 4.58722e+07 0
container_5_count_10 2091 0.382577 118.5 0 4.60091e+07 4.60091e+07 0
'split' metrics:
container id jvm-uptime-secs jvm-process-cpu-load jvm-memory-used-mb emit-count execute-count ack-count fail-count
------------------- ----------------- ---------------------- -------------------- ------------ --------------- ----------- ------------
container_1_split_2 2091 0.143034 75.3333 4.59453e+07 4.59453e+06 4.59453e+06 0
container_3_split_5 2042 1.12248 79.1667 4.64862e+07 4.64862e+06 4.64862e+06 0
container_2_split_3 2150 0.139837 83.6667 4.59443e+07 4.59443e+06 4.59443e+06 0
container_1_split_1 2091 0.145702 104.167 4.59454e+07 4.59454e+06 4.59454e+06 0
container_2_split_4 2150 0.138453 106.333 4.59443e+07 4.59443e+06 4.59443e+06 0
[2018-08-03 21:02:09 +0000] [INFO]: Elapsed time: 0.031s.
You can use the execute-count
of you sink component to measure the output of your topology. If each of your components have a 1:1 input:output ratio then this will be your throughput.
However, if you are windowing tuples into batches or splitting tuples (like separating sentences into individual words) then things get a little more complicated. You can get the input into your topology by looking at the emit-count
of your spout components. You could then use this in comparison to you bolt execute-counts
to create your own throughput metric.
An easy way to get programmatic access to these metrics is via the Heron Tracker REST API. You can use your chosen language's HTTP library (like Requests for Python) to query the last 3 hours of data for a running topology. If you require more than 3 hours of data (the maximum stored by the topology TMaster) you will need to use one of the other metrics sinks to send metrics to an external database. Heron currently provides sinks for saving to local files, Graphite or Prometheus. InfluxDB support is in the works.