prometheus

How to execute a query with two metrics in Prometheus?


I am using Prometheus to query metrics from Apache Flink. I want to measure the number of records In and Out per second of a Map function. When I query two different metrics in Prometheus, the chart only shows one of them.

flink_taskmanager_job_task_operator_numRecordsInPerSecond{operator_name="Map"} 
or flink_taskmanager_job_task_operator_numRecordsOutPerSecond{operator_name="Map"}

It does not matter whether I change the operator from or to and: the chart shows only the first metric (flink_taskmanager_job_task_operator_numRecordsInPerSecond). I have also tried editing the Prometheus config file /etc/prometheus/prometheus.yml, but I don't have much experience with Prometheus and something is wrong in my configuration. I based my attempt on this post.

global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'node_exporter'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9100']   
  - job_name: 'flink'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9250', 'localhost:9251', '192.168.56.20:9250']
    metrics_path: /
# HOW TO ADD THE OPERATOR NAME ON THE METRIC NAME?
    metric_relabel_configs:
      - source_labels: [__name__]
        regex: '(flink_taskmanager_job_task_operator)_(\w+)'
        replacement: '${2}'
        target_label: pool
      - source_labels: [__name__]
        regex: '(flink_taskmanager_job_task_operator)_(\w+)'
        replacement: '${1}_bytes'
        target_label: __name__

Solution

  • It is possible to select multiple metric names with a single PromQL query by using a regular expression filter on the __name__ label:

    {__name__=~"flink_taskmanager_job_task_operator_numRecords(In|Out)PerSecond",operator_name="Map"}
    

    See the Prometheus docs on the __name__ label for details.
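For context, the original `or` query most likely shows only one series because PromQL's `or` keeps every series from the left-hand side and adds only those right-hand series whose label set has no match on the left, and that matching ignores the metric name. The sketch below is a toy model of that rule in Python, not Prometheus code: the `promql_or` helper and the `subtask_index` label values are illustrative assumptions.

```python
# Toy model of PromQL's `or`; `promql_or` and the sample label sets are hypothetical.
METRIC = "flink_taskmanager_job_task_operator_numRecords{}PerSecond"


def signature(labels):
    """Label set used for matching: every label except the metric name."""
    return frozenset((k, v) for k, v in labels.items() if k != "__name__")


def promql_or(left, right):
    """Keep all of `left`, plus series from `right` whose signature is absent from `left`."""
    seen = {signature(s) for s in left}
    return left + [s for s in right if signature(s) not in seen]


records_in = [{"__name__": METRIC.format("In"), "operator_name": "Map", "subtask_index": "0"}]
records_out = [{"__name__": METRIC.format("Out"), "operator_name": "Map", "subtask_index": "0"}]

# Both series carry identical labels apart from __name__, so the Out series is dropped.
result = promql_or(records_in, records_out)
names = [s["__name__"] for s in result]
print(names)
```

Selecting both names with a single `__name__` regex avoids this, because no set operator (and therefore no label matching) is involved.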

    Another option, when using a Prometheus-compatible query language such as MetricsQL, is the union function:

    union(
      flink_taskmanager_job_task_operator_numRecordsInPerSecond{operator_name="Map"},
      flink_taskmanager_job_task_operator_numRecordsOutPerSecond{operator_name="Map"}
    )
    

    Note that selecting multiple time series via a __name__ regexp can result in a "vector cannot contain metrics with the same labelset" error if the selected series are wrapped in a PromQL function. For example:

    max_over_time(
      {__name__=~"flink_taskmanager_job_task_operator_numRecords(In|Out)PerSecond",operator_name="Map"}[5m]
    )
    

    This is because Prometheus removes metric names from the input series when applying PromQL functions, so series that differed only by name end up with identical label sets. MetricsQL from VictoriaMetrics provides a solution for this issue: the keep_metric_names modifier (see the MetricsQL docs for details):

    max_over_time(
      {__name__=~"flink_taskmanager_job_task_operator_numRecords(In|Out)PerSecond",operator_name="Map"}[5m]
    ) keep_metric_names
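The duplicate-labelset error can be illustrated with a small Python model. This is only a sketch of the idea, not how Prometheus or VictoriaMetrics is implemented: the helper names `drop_metric_name` and `duplicate_labelsets` are hypothetical, and the series are reduced to the single `operator_name` label for brevity.

```python
from collections import Counter


def drop_metric_name(series):
    """Mimic a function such as max_over_time: return the label sets without __name__."""
    return [{k: v for k, v in s.items() if k != "__name__"} for s in series]


def duplicate_labelsets(series):
    """Return every label set that occurs more than once in the vector."""
    counts = Counter(frozenset(s.items()) for s in series)
    return [dict(labelset) for labelset, n in counts.items() if n > 1]


# Simplified stand-ins for the two series matched by the __name__ regex.
selected = [
    {"__name__": "flink_taskmanager_job_task_operator_numRecordsInPerSecond", "operator_name": "Map"},
    {"__name__": "flink_taskmanager_job_task_operator_numRecordsOutPerSecond", "operator_name": "Map"},
]

before = duplicate_labelsets(selected)                   # names still distinguish the series
after = duplicate_labelsets(drop_metric_name(selected))  # names removed, label sets collide
print(before, after)
```

With the names kept, as keep_metric_names does, the two label sets stay distinct and the collision does not occur.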
    

    P.S. I work on VictoriaMetrics and MetricsQL.