prometheus grafana transformation metrics promql

How to get the label of "boolean" metrics from PromQL into a Grafana visualization

Background

The main problem is working with this style of metric:

some_state_metric{name="something", state="active"} = 1
some_state_metric{name="something", state="inactive"} = 0
some_state_metric{name="something", state="failing"} = 0

I don't know what they are called, but I refer to them as "boolean metrics" or "state metrics".

They track the state of something over time by using 1 and 0 as booleans to indicate whether each state is active or not.

Why do they exist?

As far as I can tell, this works around the fact that PromQL/Prometheus doesn't support text/enum values for metrics, only numbers.
An implementer has two options. The first is to have the value itself represent a state (as in 0=null, 1=active, 2=inactive, 3=failing).

The problem with that solution is that it doesn't translate well to any PromQL queries that use labels as important partitioning indicators.

...Or, as seems to be the norm, to represent each state as a separate label value and then use a boolean value to indicate which state is active.

This can also be used to represent "combined" states, since more than one state can be 1 at the same time.
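To illustrate the two encodings, here is a small Python sketch (not anything Prometheus runs). The state list comes from the example above; the helper names as_enum_value and as_state_set are made up for illustration.

```python
# Hypothetical state list taken from the example metric above.
STATES = ["active", "inactive", "failing"]

def as_enum_value(current: str) -> int:
    """Option 1: a single series whose value is a state code (0 is reserved for null)."""
    return STATES.index(current) + 1

def as_state_set(current: set[str]) -> dict[str, int]:
    """Option 2: one series per state label, valued 1 if that state holds, else 0."""
    return {state: int(state in current) for state in STATES}

print(as_enum_value("failing"))             # 3
print(as_state_set({"active"}))             # {'active': 1, 'inactive': 0, 'failing': 0}
print(as_state_set({"active", "failing"}))  # combined state: two labels are 1
```

Only the second form can express a combined state, and only the second form keeps the state available as a label for selectors and by() clauses.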

Examples of this in the wild

Unix Node Exporter's systemd node_systemd_unit_state
kube-state-metrics' Pod Phase kube_pod_status_phase

Can't use 'em

Sadly, there is no native way in Prometheus to represent these in a useful format. See here

I wish there was a special function that would work with this type of metric:

group(boolean_metric(some_state_metric, "state")) by(name)

resulting in the data-frame of:

vector<something>(1708277100, "active")
vector<something>(1708277101, "active")
vector<something>(1708277102, "active")
vector<something>(1708277103, "active")
vector<something>(1708277104, "failing")

Don't take this data-frame representation as gospel.
I know Prometheus works on the vector principle, so there needs to be a direction (time, I think) and a magnitude (the value, I think), which it then groups by a specific tag,
but I have no idea how to convey that information, or what it looks like internally.

It basically returns the value of the label (state) whose series has the value 1 at each point in time, partitioned on the name label.

But no, this doesn't exist.
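To pin down the semantics I have in mind, here is a rough sketch in Python rather than PromQL. The boolean_metric helper and the sample data are made up for illustration; nothing like this exists in Prometheus.

```python
from collections import defaultdict

# (timestamp, labels, value) samples; all data here is invented for illustration.
samples = [
    (1708277103, {"name": "something", "state": "active"}, 1),
    (1708277103, {"name": "something", "state": "inactive"}, 0),
    (1708277103, {"name": "something", "state": "failing"}, 0),
    (1708277104, {"name": "something", "state": "active"}, 0),
    (1708277104, {"name": "something", "state": "inactive"}, 0),
    (1708277104, {"name": "something", "state": "failing"}, 1),
]

def boolean_metric(samples, label, by):
    """Hypothetical helper: per `by` value, list (timestamp, label value) where the sample is 1."""
    out = defaultdict(list)
    for ts, labels, value in sorted(samples, key=lambda s: s[0]):
        if value == 1:
            out[labels[by]].append((ts, labels[label]))
    return dict(out)

result = boolean_metric(samples, label="state", by="name")
print(result)  # {'something': [(1708277103, 'active'), (1708277104, 'failing')]}
```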

Why?

I want to use the state timeline and the status history visualizations to represent the states.

The problem is that neither of these has a way to specify an alternate Value field.
They also expect the data-frame (time series) format, so I can't use tables really.

They both expect a unique discrete value returned by a query to map to a state, so 1 and 0 don't really work.

Experiments

I did try to do something with multiplying a given state by an offset:

some_state_metric{state="active"} * 10
  or
some_state_metric{state="inactive"} * 100
  or
some_state_metric{state="failing"} * 1000

You could then map 10 -> active, 100 -> inactive and 1000 -> failing.

But I struggled with combining the queries (I think I am doing it wrong). This is also "bad" because it involves multiple queries when, really, one should do.
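As far as I can tell, the arithmetic behind the idea only holds up when exactly one state is 1 at a time: weighting each state and adding the results gives one number per name that a value mapping can translate back. A small Python sketch of just that arithmetic (this is not a PromQL query, and the sample instant is invented):

```python
# Weights from the experiment above; the sample instant below is invented.
WEIGHTS = {"active": 10, "inactive": 100, "failing": 1000}
CODE_TO_STATE = {code: state for state, code in WEIGHTS.items()}

# One instant of the metric for a single name: state label -> 0/1 value.
instant = {"active": 0, "inactive": 0, "failing": 1}

def encode(instant: dict[str, int]) -> int:
    """Weight each state series and add them up into a single number."""
    return sum(value * WEIGHTS[state] for state, value in instant.items())

code = encode(instant)
print(code, CODE_TO_STATE.get(code, "unknown/combined"))  # 1000 failing
```

A combined state (two series at 1) produces a sum such as 1010 that matches no single mapping, which is one more reason this approach feels fragile.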

Question

How do I convert a Prometheus "state metric" / "boolean metric" into a format that can be used with the Grafana state timeline or status history visualizations?

Solution

I figured something out, using the new Partition by values transformation!

I'll be using the node_systemd_unit_state from the systemd module of the node exporter as an example.

Time Series Update

My previous solution used a Table query and then transformed it into a state timeline.

That approach had some problems: the time spans needed a threshold to function properly, which reduced the accuracy of the exact start and end times.

This solution keeps the time series.

Steps

Query

The first step is to group the query, keeping only the series whose value is 1 (sum should work too):

group(node_systemd_unit_state{} == 1) by(name, state)

Legend: Auto
Format: Time Series
Type: Range

Transformation

First, we convert all the labels (name, state) into fields, leaving us with (Time, name, state, Value).
Next, we merge all frames into one (the Labels to fields transformation does something weird with the frame names).
Next, we drop the Value field. We don't need it, because after the == 1 filter it is always 1.
Next, we sort by time, otherwise the state timeline will get confused.
Finally, we partition everything back into frames, one per name (with labels, because the state timeline is a little dumb, see below).

v11 Transformations

Labels To Fields
    Mode: Columns
    Labels: (name, state)
Merge series/tables
Filter fields by name
    From Variable: false
    Identifier: /^(?!Value)/
Sort By
    Field: Time
Partition by values
    Field: (name)
    Naming: As Label
    Keep Fields: No
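To show what the transformation chain above does to the data, here is a plain Python model of it. This is only a sketch of the effect of each step, not how Grafana implements the transformations; the frame layout, unit names, timestamps and the to_state_frames helper are all invented for illustration.

```python
from collections import defaultdict

# Frames as the query might return them: one per (name, state) series.
# Each point is (time, value); the value is always 1 because of the == 1 filter.
frames = [
    {"labels": {"name": "a.service", "state": "active"}, "points": [(1, 1), (2, 1)]},
    {"labels": {"name": "a.service", "state": "failed"}, "points": [(3, 1)]},
    {"labels": {"name": "b.service", "state": "inactive"}, "points": [(2, 1), (1, 1)]},
]

def to_state_frames(frames):
    # Labels to fields + Merge: flatten everything into one table of rows.
    rows = [
        {"Time": t, "Value": v, **frame["labels"]}
        for frame in frames
        for t, v in frame["points"]
    ]
    # Filter fields by name: drop the Value field.
    rows = [{k: v for k, v in row.items() if k != "Value"} for row in rows]
    # Sort by: order rows by Time.
    rows.sort(key=lambda row: row["Time"])
    # Partition by values on name: one frame per name, with name kept as the
    # frame's label and only (Time, state) left as fields.
    out = defaultdict(list)
    for row in rows:
        out[row["name"]].append((row["Time"], row["state"]))
    return dict(out)

result = to_state_frames(frames)
print(result)
```

Each resulting frame is a time-ordered list of state strings for one unit, which is the shape the state timeline can map to colours.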

Visualization

Now just set the colours of your "Value Mappings" to the desired representation.

Set the Standard options Display Name to ${__field.labels.name}.

The state timeline butchers the names of the frames because the field it displays isn't the Value field, but the state field.