I know the enterprise way (Cloudera, for example): monitoring and configuration are available through CM (in a browser) or through the Cloudera REST API.
But how can I schedule (run and rerun) Flume agents, manage their lifecycle, and monitor their running/failure status without CM? Does the Flume distribution include anything for this?
I tried passing flume.monitoring.type/port to flume-ng at startup, and it completely fits my needs.
Let's create a simple agent, a1, as an example. It listens on localhost:44444 and uses the console logger as its sink:
# flume.conf
a1.sources = s1
a1.channels = c1
a1.sinks = d1
a1.sources.s1.channels = c1
a1.sources.s1.type = netcat
a1.sources.s1.bind = localhost
a1.sources.s1.port = 44444
a1.sinks.d1.channel = c1
a1.sinks.d1.type = logger
a1.channels.c1.type = memory
a1.channels.c1.capacity = 100
a1.channels.c1.transactionCapacity = 10
Run it with the additional parameters flume.monitoring.type/port:
flume-ng agent -n a1 -c conf -f flume.conf -Dflume.root.logger=INFO,console -Dflume.monitoring.type=http -Dflume.monitoring.port=44123
Then view the metrics in a browser at localhost:44123/metrics:
{"CHANNEL.c1":{"ChannelCapacity":"100","ChannelFillPercentage":"0.0","Type":"CHANNEL","EventTakeSuccessCount":"570448","ChannelSize":"0","EventTakeAttemptCount":"570573","StartTime":"1567002601836","EventPutAttemptCount":"570449","EventPutSuccessCount":"570448","StopTime":"0"}}
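The same endpoint can be polled from a script instead of a browser. Below is a minimal sketch in Python, assuming the agent above is running on the same host with the monitoring port 44123. The names fetch_metrics and channel_report and the 90% fill threshold are my own illustrative choices, not part of Flume.

```python
import json
from urllib.request import urlopen

# Assumption: the agent was started with -Dflume.monitoring.port=44123 on this host.
METRICS_URL = "http://localhost:44123/metrics"


def fetch_metrics(url=METRICS_URL, timeout=5.0):
    """Return the decoded metrics document; raises OSError if the agent is unreachable."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def channel_report(metrics, max_fill_pct=90.0):
    """Summarise every CHANNEL component; the endpoint reports all values as strings."""
    report = {}
    for name, values in metrics.items():
        if values.get("Type") != "CHANNEL":
            continue  # skip sources, sinks and anything else
        fill = float(values["ChannelFillPercentage"])
        report[name] = {
            "size": int(values["ChannelSize"]),
            "capacity": int(values["ChannelCapacity"]),
            "fill_pct": fill,
            # Hypothetical health rule: a nearly full channel suggests a stuck sink.
            "ok": fill < max_fill_pct,
        }
    return report


if __name__ == "__main__":
    try:
        print(channel_report(fetch_metrics()))
    except OSError as exc:
        # A refused connection is itself a useful signal: the agent is probably down.
        print(f"agent not reachable: {exc}")
```

Run from cron or a similar scheduler, a script like this gives a basic running/failure check: an unreachable endpoint suggests the agent has stopped, and a high fill percentage suggests the sink is not keeping up.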
To see the counters change, send some load:
dd if=/dev/urandom count=1024 bs=1024 | base64 | nc localhost 44444
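To put a number on the load, one option is to take two snapshots of /metrics a few seconds apart and compare the EventTakeSuccessCount counters. The sketch below assumes each snapshot is the already-decoded JSON (a dict shaped like the output above). The helper name take_rate is hypothetical, and the counter values in the demo are made up for illustration.

```python
def take_rate(before, after, seconds, channel="CHANNEL.c1"):
    """Events per second successfully taken from the channel between two snapshots."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # The endpoint reports counters as strings, so convert before subtracting.
    taken = (int(after[channel]["EventTakeSuccessCount"])
             - int(before[channel]["EventTakeSuccessCount"]))
    return taken / seconds


if __name__ == "__main__":
    # Made-up snapshots taken 10 seconds apart, for illustration only.
    snap1 = {"CHANNEL.c1": {"EventTakeSuccessCount": "1000"}}
    snap2 = {"CHANNEL.c1": {"EventTakeSuccessCount": "3500"}}
    print(take_rate(snap1, snap2, 10))  # 250.0
```

The counters appear to accumulate from the agent's StartTime, so a rate that drops to zero while EventPutAttemptCount keeps growing would point at a sink that has stopped draining the channel.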