I am using Prometheus to query metrics from Apache Flink. I want to measure the number of records In and Out per second of a Map function. When I query two different metrics in Prometheus, the chart only shows one of them.
flink_taskmanager_job_task_operator_numRecordsInPerSecond{operator_name="Map"}
or flink_taskmanager_job_task_operator_numRecordsOutPerSecond{operator_name="Map"}
It does not matter if I change the operator from "or" to "and". The chart shows only the first metric (flink_taskmanager_job_task_operator_numRecordsInPerSecond). I have also tried editing the Prometheus config file /etc/prometheus/prometheus.yml, but I don't have much experience with Prometheus and there is something wrong in my configuration. I based my attempt on this post.
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'node_exporter'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9100']
  - job_name: 'flink'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9250', 'localhost:9251', '192.168.56.20:9250']
    metrics_path: /
    # HOW TO ADD THE OPERATOR NAME ON THE METRIC NAME?
    metric_relabel_configs:
      - source_labels: [__name__]
        regex: '(flink_taskmanager_job_task_operator)_(\w+)'
        replacement: '${2}'
        target_label: pool
      - source_labels: [__name__]
        regex: '(flink_taskmanager_job_task_operator)_(\w+)'
        replacement: '${1}_bytes'
        target_label: __name__
It is possible to select multiple metric names with a single PromQL query by using a regular expression filter on the __name__ label:
{__name__=~"flink_taskmanager_job_task_operator_numRecords(In|Out)PerSecond",operator_name="Map"}
See the docs about the __name__ label here.
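To illustrate why the "or" query shows only one metric while the __name__ regexp shows both, here is a rough Python sketch. It is a toy model, not how Prometheus is implemented: the subtask_index label and the helper names (select, signature, vector_or) are made up for illustration. It assumes what the Prometheus docs describe, namely that regex matchers are fully anchored and that "or" keeps every series from the left side and adds only those right-side series whose label sets, ignoring the metric name, are not already present on the left.

```python
import re

# Toy series: each is a dict of labels; "__name__" holds the metric name.
# The subtask_index label and its values are hypothetical.
PREFIX = "flink_taskmanager_job_task_operator_numRecords"
SERIES = [
    {"__name__": PREFIX + "InPerSecond", "operator_name": "Map", "subtask_index": "0"},
    {"__name__": PREFIX + "OutPerSecond", "operator_name": "Map", "subtask_index": "0"},
    {"__name__": PREFIX + "InPerSecond", "operator_name": "Sink", "subtask_index": "0"},
]


def select(series, name_regex, **equal_labels):
    """Mimic {__name__=~"...", label="value"}; the regex must match the whole name."""
    pattern = re.compile(name_regex)
    return [
        s for s in series
        if pattern.fullmatch(s["__name__"])
        and all(s.get(k) == v for k, v in equal_labels.items())
    ]


def signature(s):
    """Label set without the metric name, which is what `or` compares."""
    return frozenset((k, v) for k, v in s.items() if k != "__name__")


def vector_or(left, right):
    """Mimic `left or right`: all of left, plus right series with unseen label sets."""
    seen = {signature(s) for s in left}
    return left + [s for s in right if signature(s) not in seen]


records_in = select(SERIES, PREFIX + "InPerSecond", operator_name="Map")
records_out = select(SERIES, PREFIX + "OutPerSecond", operator_name="Map")

# Both sides share the same labels apart from __name__, so the right side is dropped.
or_names = [s["__name__"] for s in vector_or(records_in, records_out)]

# A single selector with a regex on __name__ returns both series.
regex_names = [
    s["__name__"]
    for s in select(SERIES, PREFIX + "(In|Out)PerSecond", operator_name="Map")
]

print(or_names)
print(regex_names)
```

In this model the "or" result contains only the numRecordsInPerSecond series, which matches the behaviour described in the question, while the regex selector returns both.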
Another solution, when using a Prometheus-compatible query engine such as MetricsQL, is the union function:
union(
flink_taskmanager_job_task_operator_numRecordsInPerSecond{operator_name="Map"},
flink_taskmanager_job_task_operator_numRecordsOutPerSecond{operator_name="Map"}
)
Note that selecting multiple time series via a __name__ regexp can result in a "vector cannot contain metrics with the same labelset" error if the selected series are wrapped in any PromQL function. For example:
max_over_time(
{__name__=~"flink_taskmanager_job_task_operator_numRecords(In|Out)PerSecond",operator_name="Map"}[5m]
)
This is because Prometheus removes metric names from input series when applying PromQL functions. MetricsQL from VictoriaMetrics provides a solution for this issue: the keep_metric_names modifier (see these docs for details):
max_over_time(
{__name__=~"flink_taskmanager_job_task_operator_numRecords(In|Out)PerSecond",operator_name="Map"}[5m]
) keep_metric_names
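The duplicate-labelset problem can be reproduced with a small Python toy model. This is only a sketch under stated assumptions: the sample values are invented, and the two functions below are hypothetical stand-ins written for this example, not the Prometheus or VictoriaMetrics implementation. It shows that once the metric name is stripped, the In and Out series for the same operator end up with identical label sets, and that keeping the name avoids the collision.

```python
from collections import Counter

PREFIX = "flink_taskmanager_job_task_operator_numRecords"

# (labels, samples over the lookbehind window); the sample values are made up.
SELECTED = [
    ({"__name__": PREFIX + "InPerSecond", "operator_name": "Map"}, [10.0, 12.5, 11.0]),
    ({"__name__": PREFIX + "OutPerSecond", "operator_name": "Map"}, [9.0, 12.0, 10.5]),
]


def max_over_time(series, keep_metric_names=False):
    """Toy rollup: max of each series' samples; drops __name__ unless asked to keep it."""
    result = []
    for labels, samples in series:
        out = dict(labels)
        if not keep_metric_names:
            out.pop("__name__", None)
        result.append((out, max(samples)))
    return result


def duplicate_labelsets(vector):
    """Return the label sets that occur more than once in the result vector."""
    counts = Counter(frozenset(labels.items()) for labels, _ in vector)
    return [dict(labelset) for labelset, n in counts.items() if n > 1]


dropped = duplicate_labelsets(max_over_time(SELECTED))
kept = duplicate_labelsets(max_over_time(SELECTED, keep_metric_names=True))

print("duplicates without names:", dropped)
print("duplicates with names kept:", kept)
```

With the names dropped, both results carry only operator_name="Map", which is the situation the error message complains about; with the names kept, the two results stay distinguishable.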
P.S. I work on VictoriaMetrics and MetricsQL.