I have a PostgreSQL table that represents the time-bound associations of different sensors with different labels, using a SCD-2 approach. The table structure is as follows:
CREATE TABLE SensorLabel (
sensor_id INT,
label_id INT,
start_time TIMESTAMPTZ,
end_time TIMESTAMPTZ
);
Each row in this table represents that a sensor (sensor_id) is associated with a label (label_id) during a specific time period [start_time to end_time) (so, resp. inclusive, exclusive).
Now I have a problem where I need to find all time range overlaps for a set of labels. That is, I have a set of labels {label1, label2, ..., labelN} and I want to find all time ranges when the sensor was associated with each of these labels at the same time.
Please note that the time range for each label association could be different, so the intersection of these ranges may break up into smaller time-ranges. I want to return only those time-ranges during which the sensor is associated with all the given labels.
I can solve for say a static 2 labels or 3 labels but have problems wrapping this in a generic SQL that solves for a variable N labels
EDIT: some sample input and output
sensor|label|from|to
1|1|2021-01-01|2021-10-01
1|2|2020-12-01|2021-05-01
1|2|2021-07-01|2021-09-01
1|3|2021-03-01|2021-06-01
1|3|2021-08-01|2021-12-01
Ouput: i.e.: time-ranges where (time-ranges for) label 1,2,3, overlap:
sensor|from|to
1|2021-03-01|2021-05-01
1|2021-08-01|2021-09-01
The only way I could do it is with a function:
CREATE TABLE sensorlabel (
sensor_id INT,
label_id INT,
start_time TIMESTAMPTZ,
end_time TIMESTAMPTZ
);
insert into sensorlabel values
(1,1,'2021-01-01','2021-10-01'),
(1,2,'2020-12-01','2021-05-01'),
(1,2,'2021-07-01','2021-09-01'),
(1,3,'2021-03-01','2021-06-01'),
(1,3,'2021-08-01','2021-12-01');
CREATE OR REPLACE FUNCTION public.tsrange_fnc(l_ids integer[], s_id integer)
RETURNS SETOF tstzrange
LANGUAGE plpgsql
AS $function$
DECLARE
_tsrange tstzrange;
_prevrange tstzrange := NULL;
_testrange tstzrange;
BEGIN
for _tsrange in select
tstzrange(start_time, end_time)
from
sensorlabel
where
label_id = ANY(l_ids)
and
sensor_id = s_id
order by
start_time LOOP
IF _prevrange IS NULL THEN
_prevrange = _tsrange;
ELSE
_testrange = _tsrange * _prevrange;
IF _testrange = 'empty'::tstzrange THEN
RETURN NEXT _prevrange;
_prevrange = _tsrange;
ELSE
_prevrange = _tsrange * _prevrange;
END IF;
END IF;
END LOOP;
RETURN NEXT _prevrange;
END;
$function$
select tsrange_fnc(ARRAY[1, 2, 3], 1);
tsrange_fnc
-------------------------------------------------------
["03/01/2021 00:00:00 PST","05/01/2021 00:00:00 PDT")
["08/01/2021 00:00:00 PDT","09/01/2021 00:00:00 PDT")
select tsrange_fnc(ARRAY[1, 2], 1);
tsrange_fnc
-------------------------------------------------------
["01/01/2021 00:00:00 PST","05/01/2021 00:00:00 PDT")
["07/01/2021 00:00:00 PDT","09/01/2021 00:00:00 PDT")
select tsrange_fnc(ARRAY[2, 3], 1);
tsrange_fnc
-------------------------------------------------------
["03/01/2021 00:00:00 PST","05/01/2021 00:00:00 PDT")
["08/01/2021 00:00:00 PDT","09/01/2021 00:00:00 PDT")