I have a logs table with id
, name
, type
, and date
columns. And I want to create a new column which represents next id
value from the 2nd type
. Query must be grouped by name and only type1 values filled based on next type2 value.
logs table:
id | name | type | date |
---|---|---|---|
1 | name1 | type1 | 2022-01-01 |
2 | name1 | type1 | 2022-01-02 |
3 | name1 | type2 | 2022-01-03 |
4 | name1 | type1 | 2022-01-03 |
5 | name1 | type2 | 2022-01-04 |
6 | name1 | type1 | 2022-01-05 |
7 | name2 | type1 | 2022-01-03 |
8 | name2 | type2 | 2022-01-08 |
desirable output:
id | name | type | date | type2_id |
---|---|---|---|---|
1 | name1 | type1 | 2022-01-01 | 3 |
2 | name1 | type1 | 2022-01-02 | 3 |
4 | name1 | type1 | 2022-01-03 | 5 |
6 | name1 | type1 | 2022-01-05 | |
7 | name2 | type1 | 2022-01-03 | 8 |
I was able to do it using joins and LAG function. However, maybe someone could help me make it more efficient?
WITH logs AS (
SELECT 1 AS id, 'name1' AS name, 'type1' AS type, '2022-01-01' AS date,
UNION ALL
SELECT 2 AS id, 'name1' AS name, 'type1' AS type, '2022-01-02' AS date,
UNION ALL
SELECT 3 AS id, 'name1' AS name, 'type2' AS type, '2022-01-03' AS date,
UNION ALL
SELECT 4 AS id, 'name1' AS name, 'type1' AS type, '2022-01-03' AS date,
UNION ALL
SELECT 5 AS id, 'name1' AS name, 'type2' AS type, '2022-01-04' AS date,
UNION ALL
SELECT 6 AS id, 'name1' AS name, 'type1' AS type, '2022-01-05' AS date,
UNION ALL
SELECT 7 AS id, 'name2' AS name, 'type1' AS type, '2022-01-03' AS date,
UNION ALL
SELECT 8 AS id, 'name2' AS name, 'type2' AS type, '2022-01-08' AS date,
)
SELECT
t1.id,
t1.name,
t1.type,
t1.date,
t2.id AS type2_id,
FROM (
SELECT *,
FROM logs
WHERE logs.type = 'type1'
) AS t1
LEFT JOIN (
SELECT *,
IFNULL(LAG(logs.date) OVER(PARTITION BY logs.name, logs.type ORDER BY logs.date), '2000-01-1') AS date_prev,
FROM logs
WHERE type = 'type2'
) AS t2
ON t2.name = t1.name
AND t2.date > t1.date
AND t2.date_prev <= t1.date
ORDER BY t1.name, t1.date
Consider below approach
select *, first_value(if(type='type2', id, null) ignore nulls) over win as type2_id
from logs
qualify type = 'type1'
window win as (partition by name order by date rows between 1 following and unbounded following)
if applied to sample data in your question - output is