From the table below, I am trying to create another column (last_k2X
) that will show the last key2 where type
was set to 'X'
over time (ti
).
In case X
was set at the same time (ts
) for multiple key2
, (in same key1
partition), the new column last_k2X
will have the key2
for all key2
in that partition/time.
Input:
key1 | key2 | ts | type |
---|---|---|---|
1 | A | t0 | |
1 | B | t1 | a |
1 | C | t1 | X |
1 | D | t2 | b |
1 | E | t3 | |
1 | F | t4 | c |
1 | G | t5 | X |
1 | H | t5 | |
1 | I | t6 | d |
I tried to use window functions like FIRST_VALUE() and LAG() but could not manage to have the right results. I expected the result like:
Expected output:
key1 | key2 | ts | type | last_k2X |
---|---|---|---|---|
1 | A | t0 | ||
1 | B | t1 | a | C |
1 | C | t1 | X | C |
1 | D | t2 | b | C |
1 | E | t3 | C | |
1 | F | t4 | c | C |
1 | G | t5 | X | G |
1 | H | t5 | G | |
1 | I | t6 | d | G |
As you didn't specify the database system you are using, this solution is in MySQL, but can be simply converted to other database systems.
This is basically a gaps and island problem, but need to get the right partitions
CREATE TABLE data
(key1 int, key2 varchar(1), ts varchar(2), type varchar(4))
;
INSERT INTO data
(key1, key2, ts, type)
VALUES
(1, 'A', 't0', NULL),
(1, 'B', 't1', 'a'),
(1, 'C', 't1', 'X'),
(1, 'D', 't2', 'b'),
(1, 'E', 't3', NULL),
(1, 'F', 't4', 'c'),
(1, 'G', 't5', 'X'),
(1, 'H', 't5', NULL),
(1, 'I', 't6', 'd')
;
WITH x_sel AS (SELECT key1, key2, ts,type, CASE WHEN type = 'X' then 1 ELSE 0 END rk
FROM data
), CTE2 as
(SELECT
x_sel.key1, x_sel.key2, x_sel.ts,x_sel.type,x_sel2.key2 k2X, SUM(rk) OVER (PARTITION BY x_sel.key1 ORDER BY x_sel.key2) s_rk FROM x_sel
LEFT JOIN (SELECT key1, key2, ts FROM data WHERE type = 'X') x_sel2
ON x_sel.key1 = x_sel2.key1 ANd x_sel.ts = x_sel2.ts)
SELECT key1, key2, ts,type,CASE WHEN s_rk = 0 THEn k2x ELSE COALESCE(k2x,MAX(k2X) OVER(PARTITION BY s_rk ORDER BY S_rk)) END k2x
FROM CTE2
key1 | key2 | ts | type | k2x |
---|---|---|---|---|
1 | A | t0 | null | null |
1 | B | t1 | a | C |
1 | C | t1 | X | C |
1 | D | t2 | b | C |
1 | E | t3 | null | C |
1 | F | t4 | c | C |
1 | G | t5 | X | G |
1 | H | t5 | null | G |
1 | I | t6 | d | G |