prometheus, jmx-exporter

Prometheus JMX exporter failing with "context deadline exceeded"


I have successfully enabled monitoring via node_exporter, but the JMX exporter target is failing.

I can curl the jmx_exporter metrics endpoint (http://localhost:55555/testsvr2/jmx_exporter/metrics) with a response time of under a second (output attached below), but Prometheus shows the target as "DOWN" with the message "context deadline exceeded".

Here is the Prometheus config I am using to monitor the server:

global:
  scrape_interval: 15s

scrape_configs:
  - job_name: testsvr2_node
    scrape_interval: 5s
    metrics_path: /testsvr2/node_exporter/metrics
    static_configs:
      - targets: ['localhost:55555']
  - job_name: testsvr2_jmx
    scrape_interval: 20s
    metrics_path: /testsvr2/jmx_exporter/metrics
    static_configs:
      - targets: ['localhost:55555']

JMX exporter curl output:

# HELP jvm_buffer_pool_used_bytes Used bytes of a given JVM buffer pool.
# TYPE jvm_buffer_pool_used_bytes gauge
jvm_buffer_pool_used_bytes{pool="direct",} 246744.0
jvm_buffer_pool_used_bytes{pool="mapped",} 0.0
# HELP jvm_buffer_pool_capacity_bytes Bytes capacity of a given JVM buffer pool.
# TYPE jvm_buffer_pool_capacity_bytes gauge
jvm_buffer_pool_capacity_bytes{pool="direct",} 246744.0
jvm_buffer_pool_capacity_bytes{pool="mapped",} 0.0
# HELP jvm_buffer_pool_used_buffers Used buffers of a given JVM buffer pool.
# TYPE jvm_buffer_pool_used_buffers gauge
jvm_buffer_pool_used_buffers{pool="direct",} 30.0
jvm_buffer_pool_used_buffers{pool="mapped",} 0.0
# HELP jvm_memory_bytes_used Used bytes of a given JVM memory area.
# TYPE jvm_memory_bytes_used gauge
jvm_memory_bytes_used{area="heap",} 4.98246352E8
jvm_memory_bytes_used{area="nonheap",} 2.76580424E8
# HELP jvm_memory_bytes_committed Committed (bytes) of a given JVM memory area.
# TYPE jvm_memory_bytes_committed gauge
jvm_memory_bytes_committed{area="heap",} 6.33339904E8
jvm_memory_bytes_committed{area="nonheap",} 3.96230656E8
# HELP jvm_memory_bytes_max Max (bytes) of a given JVM memory area.
# TYPE jvm_memory_bytes_max gauge
jvm_memory_bytes_max{area="heap",} 3.817865216E9
jvm_memory_bytes_max{area="nonheap",} 1.124073472E9
# HELP jvm_memory_bytes_init Initial bytes of a given JVM memory area.
# TYPE jvm_memory_bytes_init gauge
jvm_memory_bytes_init{area="heap",} 2.59995072E8
jvm_memory_bytes_init{area="nonheap",} 2.4576E7
# HELP jvm_memory_pool_bytes_used Used bytes of a given JVM memory pool.
# TYPE jvm_memory_pool_bytes_used gauge
jvm_memory_pool_bytes_used{pool="Code Cache",} 2.1598784E7
jvm_memory_pool_bytes_used{pool="PS Eden Space",} 8.0618168E7
jvm_memory_pool_bytes_used{pool="PS Survivor Space",} 2097152.0
jvm_memory_pool_bytes_used{pool="PS Old Gen",} 4.15531032E8
jvm_memory_pool_bytes_used{pool="PS Perm Gen",} 2.5498164E8
# HELP jvm_memory_pool_bytes_committed Committed bytes of a given JVM memory pool.
# TYPE jvm_memory_pool_bytes_committed gauge
jvm_memory_pool_bytes_committed{pool="Code Cache",} 2.1889024E7
jvm_memory_pool_bytes_committed{pool="PS Eden Space",} 8.8604672E7
jvm_memory_pool_bytes_committed{pool="PS Survivor Space",} 2097152.0
jvm_memory_pool_bytes_committed{pool="PS Old Gen",} 5.4263808E8
jvm_memory_pool_bytes_committed{pool="PS Perm Gen",} 3.74341632E8
# HELP jvm_memory_pool_bytes_max Max bytes of a given JVM memory pool.
# TYPE jvm_memory_pool_bytes_max gauge
jvm_memory_pool_bytes_max{pool="Code Cache",} 5.0331648E7
jvm_memory_pool_bytes_max{pool="PS Eden Space",} 1.42606336E9
jvm_memory_pool_bytes_max{pool="PS Survivor Space",} 2097152.0
jvm_memory_pool_bytes_max{pool="PS Old Gen",} 2.863136768E9
jvm_memory_pool_bytes_max{pool="PS Perm Gen",} 1.073741824E9
# HELP jvm_memory_pool_bytes_init Initial bytes of a given JVM memory pool.
# TYPE jvm_memory_pool_bytes_init gauge
jvm_memory_pool_bytes_init{pool="Code Cache",} 2555904.0
jvm_memory_pool_bytes_init{pool="PS Eden Space",} 6.6060288E7
jvm_memory_pool_bytes_init{pool="PS Survivor Space",} 1.048576E7
jvm_memory_pool_bytes_init{pool="PS Old Gen",} 1.7301504E8
jvm_memory_pool_bytes_init{pool="PS Perm Gen",} 2.2020096E7
# HELP tomcat_errorcount_total Tomcat global errorCount
# TYPE tomcat_errorcount_total counter
tomcat_errorcount_total{port="8009",protocol="ajp-bio",} 0.0
tomcat_errorcount_total{port="8080",protocol="http-nio",} 792.0
# HELP tomcat_threadpool_connectioncount Tomcat threadpool connectionCount
# TYPE tomcat_threadpool_connectioncount gauge
tomcat_threadpool_connectioncount{port="8009",protocol="ajp-bio",} 1.0
tomcat_threadpool_connectioncount{port="8080",protocol="http-nio",} 1.0
# HELP tomcat_threadpool_pollerthreadcount Tomcat threadpool pollerThreadCount
# TYPE tomcat_threadpool_pollerthreadcount gauge
tomcat_threadpool_pollerthreadcount{port="8080",protocol="http-nio",} 2.0
# HELP tomcat_processingtime_total Tomcat global processingTime
# TYPE tomcat_processingtime_total counter
tomcat_processingtime_total{port="8009",protocol="ajp-bio",} 0.0
tomcat_processingtime_total{port="8080",protocol="http-nio",} 11878.0
# HELP tomcat_bytessent_total Tomcat global bytesSent
# TYPE tomcat_bytessent_total counter
tomcat_bytessent_total{port="8009",protocol="ajp-bio",} 0.0
tomcat_bytessent_total{port="8080",protocol="http-nio",} 8548511.0
# HELP tomcat_maxtime_total Tomcat global maxTime
# TYPE tomcat_maxtime_total counter
tomcat_maxtime_total{port="8009",protocol="ajp-bio",} 0.0
tomcat_maxtime_total{port="8080",protocol="http-nio",} 1583.0
# HELP tomcat_bytesreceived_total Tomcat global bytesReceived
# TYPE tomcat_bytesreceived_total counter
tomcat_bytesreceived_total{port="8009",protocol="ajp-bio",} 0.0
tomcat_bytesreceived_total{port="8080",protocol="http-nio",} 43847.0
# HELP tomcat_threadpool_currentthreadsbusy Tomcat threadpool currentThreadsBusy
# TYPE tomcat_threadpool_currentthreadsbusy gauge
tomcat_threadpool_currentthreadsbusy{port="8009",protocol="ajp-bio",} 0.0
tomcat_threadpool_currentthreadsbusy{port="8080",protocol="http-nio",} 0.0
# HELP tomcat_requestcount_total Tomcat global requestCount
# TYPE tomcat_requestcount_total counter
tomcat_requestcount_total{port="8009",protocol="ajp-bio",} 0.0
tomcat_requestcount_total{port="8080",protocol="http-nio",} 862.0
# HELP tomcat_threadpool_currentthreadcount Tomcat threadpool currentThreadCount
# TYPE tomcat_threadpool_currentthreadcount gauge
tomcat_threadpool_currentthreadcount{port="8009",protocol="ajp-bio",} 0.0
tomcat_threadpool_currentthreadcount{port="8080",protocol="http-nio",} 25.0
# HELP tomcat_threadpool_keepalivecount Tomcat threadpool keepAliveCount
# TYPE tomcat_threadpool_keepalivecount gauge
tomcat_threadpool_keepalivecount{port="8080",protocol="http-nio",} 0.0
# HELP jmx_scrape_duration_seconds Time this JMX scrape took, in seconds.
# TYPE jmx_scrape_duration_seconds gauge
jmx_scrape_duration_seconds 0.201767373
# HELP jmx_scrape_error Non-zero if this scrape failed.
# TYPE jmx_scrape_error gauge
jmx_scrape_error 0.0
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 329.21
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.540210335811E9
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 202.0
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 4096.0
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 7.924580352E9
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 9.93017856E8
# HELP jmx_config_reload_success_total Number of times configuration have successfully been reloaded.
# TYPE jmx_config_reload_success_total counter
jmx_config_reload_success_total 0.0
# HELP jvm_threads_current Current thread count of a JVM
# TYPE jvm_threads_current gauge
jvm_threads_current 118.0
# HELP jvm_threads_daemon Daemon thread count of a JVM
# TYPE jvm_threads_daemon gauge
jvm_threads_daemon 61.0
# HELP jvm_threads_peak Peak thread count of a JVM
# TYPE jvm_threads_peak gauge
jvm_threads_peak 119.0
# HELP jvm_threads_started_total Started thread count of a JVM
# TYPE jvm_threads_started_total counter
jvm_threads_started_total 130.0
# HELP jvm_threads_deadlocked Cycles of JVM-threads that are in deadlock waiting to acquire object monitors or ownable synchronizers
# TYPE jvm_threads_deadlocked gauge
jvm_threads_deadlocked 0.0
# HELP jvm_threads_deadlocked_monitor Cycles of JVM-threads that are in deadlock waiting to acquire object monitors
# TYPE jvm_threads_deadlocked_monitor gauge
jvm_threads_deadlocked_monitor 0.0
# HELP jmx_config_reload_failure_total Number of times configuration have failed to be reloaded.
# TYPE jmx_config_reload_failure_total counter
jmx_config_reload_failure_total 0.0
# HELP jvm_info JVM version info
# TYPE jvm_info gauge
jvm_info{version="1.7.0_80-b15",vendor="Oracle Corporation",runtime="Java(TM) SE Runtime Environment",} 1.0
# HELP jvm_gc_collection_seconds Time spent in a given JVM garbage collector in seconds.
# TYPE jvm_gc_collection_seconds summary
jvm_gc_collection_seconds_count{gc="PS Scavenge",} 458.0
jvm_gc_collection_seconds_sum{gc="PS Scavenge",} 5.806
jvm_gc_collection_seconds_count{gc="PS MarkSweep",} 3.0
jvm_gc_collection_seconds_sum{gc="PS MarkSweep",} 1.192
# HELP jvm_classes_loaded The number of classes that are currently loaded in the JVM
# TYPE jvm_classes_loaded gauge
jvm_classes_loaded 37664.0
# HELP jvm_classes_loaded_total The total number of classes that have been loaded since the JVM has started execution
# TYPE jvm_classes_loaded_total counter
jvm_classes_loaded_total 37664.0
# HELP jvm_classes_unloaded_total The total number of classes that have been unloaded since the JVM has started execution
# TYPE jvm_classes_unloaded_total counter
jvm_classes_unloaded_total 0.0

P.S.: I tried increasing the scrape interval to 30 seconds, as suggested in other sources, without success: the error stays the same, even though curl returns the metrics endpoint output within a second.


Solution

  • First, let me give the solution: the issue was resolved by updating nginx at the destination (i.e. the client) to use proxy_http_version 1.1.
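
    Roughly, the relevant location block in the client-side nginx ended up looking like the sketch below. The jmx_exporter port (9404) and the exact paths are illustrative rather than a copy of my real config; the important line is proxy_http_version 1.1, since nginx speaks HTTP/1.0 to upstreams by default.

    location /jmx_exporter/metrics {
        proxy_pass         http://127.0.0.1:9404/metrics;
        proxy_http_version 1.1;            # default is 1.0, which is what broke the scrapes
        proxy_set_header   Connection "";  # clear "Connection: close" so keep-alive works
    }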

    Let me explain my setup so that we understand why nginx is needed in the first place and how I arrived at the solution.

    Prometheus monitoring setup

    The client servers are not directly reachable from Prometheus, so scrapes go through a proxy chain. As per the Prometheus scrape config above, Prometheus scrapes localhost:55555 on its own host with a per-server path prefix (/testsvr2/...); that proxy forwards the request to the nginx running on the client, which in turn proxies /node_exporter/metrics and /jmx_exporter/metrics to the local exporters.
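
    In my case the listener on localhost:55555 is also an nginx reverse proxy. Schematically it looks like the sketch below (simplified, not my exact config); the per-server prefix is stripped before the request is passed on to the client's nginx:

    server {
        listen 55555;

        location /testsvr2/ {
            # /testsvr2/jmx_exporter/metrics -> http://testsvr2/jmx_exporter/metrics
            proxy_pass http://testsvr2/;
        }
    }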

    How did I arrive at nginx as the source of the issue?

    For a quick test, I made one of the clients directly accessible from Prometheus and changed the Prometheus scrape config to:

    global:
      scrape_interval: 15s

    scrape_configs:
      - job_name: testsvr2_node
        scrape_interval: 5s
        metrics_path: /node_exporter/metrics
        static_configs:
          - targets: ['testsvr2']
      - job_name: testsvr2_jmx
        scrape_interval: 20s
        metrics_path: /jmx_exporter/metrics
        static_configs:
          - targets: ['testsvr2']
    

    After the above config change I started getting a different error, "unexpected EOF", and researching that error led me to the end result.

    Only after the error message changed from "context deadline exceeded" to "unexpected EOF" was I able to get to the solution.

    Hope this is helpful to someone with a similar architecture and an equally unhelpful error message.