sudo pid telegraf telegraf-inputs-plugin procstat

Telegraf - inputs.procstat procstat Plugin - README.md doc - exe, pid_file, command line pattern username


Using: Telegraf v1.0.1

Telegraf procstat plugin's documentation: https://github.com/influxdata/telegraf/tree/master/plugins/inputs/procstat

My custom config File:
/etc/telegraf/telegraf.d/my_custom_process_service-telegraf.conf contains:

[[inputs.procstat]]
  exe = "."
  prefix = "service_process"

[[inputs.procstat]]
  pid_file = "/var/run/jenkins/jenkins.pid"
  prefix = "service_process"

The above configuration is syntactically valid and works. It produces metrics whose names start with procstat.service.process.x.x (if you are converting _ to a . character) or simply procstat.service_process.x.x.

exe = "." catches every process running on the machine (Telegraf does a pgrep "." operation to find them) and reports each one with a process_name=<process> value. Alternatively, pid_file = /var/run/jenkins/jenkins.pid can be used for processes that run behind Java or other wrappers, provided the user running the telegraf service has READ permission on the file. If Jenkins is running under user jenkins, and the telegraf user doesn't have at least "r-x" access on the /var/run/jenkins folder plus read "r" access on the pid file itself, Telegraf throws a "permission denied" error:

2017-01-10T18:13:30Z E! Error: procstat getting process, exe: [] pidfile: [/var/run/jenkins/jenkins.pid] pattern: [] user: [] Failed to read pidfile '/var/run/jenkins/jenkins.pid'. Error: 'open /var/run/jenkins/jenkins.pid: permission denied' 

Question:

Is it possible to run Telegraf in sudo mode? That is, if I don't have r-x/r access to read a process's PID file, and there are lots of such processes (running behind Java or some wrapper, so exe=xxxx won't work), I have to use the pid_file = ... method. How can I get Telegraf working with pid_file so that process_name comes out as jenkins, nexus, etc.?

PS: Doing chmod -R 775_or_755 /var/run on every host may not be feasible.

If I give 755 permission on the /var/run/jenkins folder and 644 on the jenkins.pid file, the permission error goes away. After this I tried to use the metric procstat.service.process.cpu.usage against the jenkins process (i.e. process_name="jenkins"), but jenkins is not found as one of its values. Did I miss anything?


Solution

  • Added the following config in /etc/telegraf/telegraf.d/someFile.conf and fixed the permission issue using Ansible's file module: http://docs.ansible.com/ansible/file_module.html

    ## Telegraf filestat plugin
    [[inputs.filestat]]
      files = ["/var/run/*/*.pid","/var/run/*.pid"]
    
    ## To catch all processes. Better than pattern = "."
    [[inputs.procstat]]
      exe = "."
      prefix = "pgrep_serviceprocess"
    
    ## For catching processes by a user.
    ## Telegraf will use: pgrep -u <user>
    [[inputs.procstat]]
      user = "vagrant"
      prefix = "pgrep_serviceprocess"
    
    [[inputs.procstat]]
      user = "telegraf"
      prefix = "pgrep_serviceprocess"
    
    [[inputs.procstat]]
      user = "root"
      prefix = "pgrep_serviceprocess"
    
    ## Add more users or template it out in Ansible.
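
For the pid_file case from the question, where process_name="jenkins" was not found, a possible sketch is below. It assumes the installed Telegraf version supports the process_name override described in the procstat plugin README; by default the value is sourced from /proc/<pid>/status, which for a Java-wrapped service is typically java rather than jenkins. The pattern value in the second stanza is a hypothetical example and is not taken from the original setup.

```toml
## Sketch only: assumes the procstat process_name override is available
## in the installed Telegraf version (check the plugin README).
[[inputs.procstat]]
  pid_file = "/var/run/jenkins/jenkins.pid"
  ## Force the process_name value; otherwise it comes from /proc/<pid>/status.
  process_name = "jenkins"
  prefix = "service_process"

## Alternative without a PID file: match on the full command line (pgrep -f).
## "jenkins.war" is a hypothetical pattern; adjust it to the real command line.
[[inputs.procstat]]
  pattern = "jenkins.war"
  process_name = "jenkins"
  prefix = "service_process"
```

The pid_file stanza still needs the telegraf user to be able to read the PID file, so the permission fix above still applies. The pattern stanza avoids the PID file altogether.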