Using: Telegraf v1.0.1
Telegraf procstat plugin's documentation: https://github.com/influxdata/telegraf/tree/master/plugins/inputs/procstat
My custom config file, /etc/telegraf/telegraf.d/my_custom_process_service-telegraf.conf, contains:
[[inputs.procstat]]
exe = "."
prefix = "service_process"
[[inputs.procstat]]
pid_file = "/var/run/jenkins/jenkins.pid"
prefix = "service_process"
The above configuration is syntactically valid. It gives me metrics whose names start with procstat.service.process.xx.xx (if you are converting the _ character to a . character) or simply procstat.service_process.x.x.
The idea is to catch any process running on the machine using exe = "." (it does a pgrep "." operation), so that each process can then be found by its process_name=<processes> value; or to use pid_file = /var/run/jenkins/jenkins.pid for processes which run behind Java or other wrappers (NOTE: this requires READ permission for the user running the telegraf service). If you give pid_file = /var/run/jenkins/jenkins.pid, Jenkins is running under user jenkins, and the telegraf user doesn't have at least "r-x" access on the /var/run/jenkins folder plus read "r" access on the pid file itself, then Telegraf will throw an error about "permission denied":
2017-01-10T18:13:30Z E! Error: procstat getting process, exe: [] pidfile: [/var/run/jenkins/jenkins.pid] pattern: [] user: [] Failed to read pidfile '/var/run/jenkins/jenkins.pid'. Error: 'open /var/run/jenkins/jenkins.pid: permission denied'
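To narrow down which part of the path triggers an error like this, a small check of the permission bits can help. The following is only a sketch: the function name blocking_components is made up for illustration, and it assumes the telegraf user is neither the owner nor in the group of the paths involved, so only the "other" permission bits matter. It has to be run by a user who can already stat every component (for example root or the owning user).

```python
import os
import stat


def blocking_components(pid_file):
    """Return (path, missing_bit) pairs that would deny access to an "other" user.

    Hypothetical helper: assumes the reader is neither owner nor group member.
    """
    problems = []
    path = os.path.abspath(pid_file)

    # Collect every ancestor directory, from the file's parent up to the root.
    parents = []
    current = os.path.dirname(path)
    while True:
        parents.append(current)
        parent = os.path.dirname(current)
        if parent == current:
            break
        current = parent

    # Walk from the root downwards; traversing a directory needs the execute bit.
    for directory in reversed(parents):
        if not os.stat(directory).st_mode & stat.S_IXOTH:
            problems.append((directory, "o+x"))

    # Reading the pid file itself needs the read bit.
    if not os.stat(path).st_mode & stat.S_IROTH:
        problems.append((path, "o+r"))
    return problems


if __name__ == "__main__":
    for path, missing in blocking_components("/var/run/jenkins/jenkins.pid"):
        print(f"{path}: missing {missing}")
```

An empty result means the mode bits alone should not block an unrelated user; ACLs or SELinux rules are not covered by this check.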
Question:
Is it possible for Telegraf to run in SUDO mode? That is, if I don't have r-x/r access to read a process's PID file, and there are lots of such processes (running behind Java or some wrapper, so exe=xxxx won't work in such cases), then I have to use the pid_file = ... method. How can I get Telegraf working with the pid_file method so that the process_name comes out as jenkins or nexus etc.?
PS: Doing chmod -R 775_or_755 /var/run on every host may not be feasible.
If I give 755 permission on the /var/run/jenkins folder and 644 on the jenkins.pid file, the permission error goes away. After this I tried to use the metric procstat.service.process.cpu.usage against the process jenkins (i.e. process_name="jenkins"), but it's not finding jenkins as its value. Did I miss anything?
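One possible reason for a lookup like this coming up empty, stated here as an assumption rather than a confirmed diagnosis: if the process_name tag defaults to the name the kernel reports in /proc/<pid>/status, a Jenkins instance started through a Java wrapper would show up as java, not jenkins. The sketch below (both function names are hypothetical, and the second one is Linux only) prints the name the kernel reports for the PID stored in a pid file.

```python
from pathlib import Path


def name_from_status(status_text):
    """Extract the value of the Name: field from /proc/<pid>/status content."""
    for line in status_text.splitlines():
        if line.startswith("Name:"):
            return line.split(":", 1)[1].strip()
    return None


def name_for_pid_file(pid_file):
    """Read a PID from pid_file and return the kernel-reported name (Linux only)."""
    pid = int(Path(pid_file).read_text().strip())
    return name_from_status(Path(f"/proc/{pid}/status").read_text())


if __name__ == "__main__":
    print(name_for_pid_file("/var/run/jenkins/jenkins.pid"))
```

If this prints java, querying with process_name="java" (or whatever name is printed) would be the thing to try first.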
I added the following config in /etc/telegraf/telegraf.d/someFile.conf and fixed the permission issue using Ansible's file module: http://docs.ansible.com/ansible/file_module.html
## Telegraf filestat plugin
[[inputs.filestat]]
files = ["/var/run/*/*.pid","/var/run/*.pid"]
## To catch all processes. Better than pattern = "."
[[inputs.procstat]]
exe = "."
prefix = "pgrep_serviceprocess"
## For catching processes by a user.
## Telegraf will use: pgrep -u <user>
[[inputs.procstat]]
user = "vagrant"
prefix = "pgrep_serviceprocess"
[[inputs.procstat]]
user = "telegraf"
prefix = "pgrep_serviceprocess"
[[inputs.procstat]]
user = "root"
prefix = "pgrep_serviceprocess"
## Add more users or template it out in Ansible.
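If Ansible templating is not at hand, the repeated per-user blocks above could also be generated with a few lines of Python. This is a minimal sketch: render_procstat_blocks is a made-up helper name, and the user list is only an example mirroring the config above.

```python
def render_procstat_blocks(users, prefix="pgrep_serviceprocess"):
    """Render one [[inputs.procstat]] block per user, in the layout used above."""
    blocks = []
    for user in users:
        blocks.append(
            "[[inputs.procstat]]\n"
            f'user = "{user}"\n'
            f'prefix = "{prefix}"\n'
        )
    return "".join(blocks)


if __name__ == "__main__":
    # Example user list; replace with the accounts that exist on the host.
    print(render_procstat_blocks(["vagrant", "telegraf", "root"]), end="")
```

The output can be redirected into a file under /etc/telegraf/telegraf.d/ and picked up on the next Telegraf restart.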