With the following table, I need to make a query to replace the null values of “time” (number of minutes, integer) by a value consistent with the preceding and following values, with partitioning by “esz” and sorting by ascending “rank”. In the example, the data is already sorted. Ideally:
id | esz | code | rank | time |
---|---|---|---|---|
1 | 1 | "SOM" | 1 | 5 |
2 | 1 | "QUI" | 2 | NULL |
3 | 1 | "VER" | 3 | 10 |
4 | 1 | "NSC" | 4 | 15 |
5 | 1 | "3SM" | 99 | NULL |
6 | 2 | "QUI" | 1 | 7 |
7 | 2 | "VER" | 2 | NULL |
8 | 2 | "SOM" | 3 | NULL |
9 | 2 | "NSC" | 4 | 12 |
10 | 2 | "3SM" | 99 | NULL |
11 | 3 | "NSC" | 1 | NULL |
12 | 3 | "VER" | 2 | NULL |
13 | 3 | "QUI" | 3 | 11 |
14 | 3 | "SOM" | 4 | 12 |
15 | 3 | "3SM" | 99 | NULL |
16 | 4 | "SOM" | 1 | 2 |
17 | 4 | "QUI" | 2 | NULL |
18 | 4 | "NSC" | 3 | 3 |
19 | 4 | "VER" | 4 | NULL |
20 | 4 | "3SM" | 99 | NULL |
21 | 5 | "NSC" | 1 | NULL |
22 | 5 | "SOM" | 2 | 4 |
23 | 5 | "VER" | 3 | NULL |
24 | 5 | "QUI" | 4 | 7 |
25 | 5 | "3SM" | 99 | NULL |
Sample output:
id | esz | code | rank | time |
---|---|---|---|---|
1 | 1 | "SOM" | 1 | 5 |
2 | 1 | "QUI" | 2 | 6 |
3 | 1 | "VER" | 3 | 10 |
4 | 1 | "NSC" | 4 | 15 |
5 | 1 | "3SM" | 99 | 16 |
6 | 2 | "QUI" | 1 | 7 |
7 | 2 | "VER" | 2 | 8 |
8 | 2 | "SOM" | 3 | 9 |
9 | 2 | "NSC" | 4 | 12 |
10 | 2 | "3SM" | 99 | 13 |
11 | 3 | "NSC" | 1 | 9 |
12 | 3 | "VER" | 2 | 10 |
13 | 3 | "QUI" | 3 | 11 |
14 | 3 | "SOM" | 4 | 12 |
15 | 3 | "3SM" | 99 | 13 |
16 | 4 | "SOM" | 1 | 2 |
17 | 4 | "QUI" | 2 | 2 |
18 | 4 | "NSC" | 3 | 3 |
19 | 4 | "VER" | 4 | 4 |
20 | 4 | "3SM" | 99 | 5 |
21 | 5 | "NSC" | 1 | 3 |
22 | 5 | "SOM" | 2 | 4 |
23 | 5 | "VER" | 3 | 5 |
24 | 5 | "QUI" | 4 | 7 |
25 | 5 | "3SM" | 99 | 8 |
I can handle most of the cases in the table with LAG() and LEAD() and CASE. But I still have problems with some of them: id=11 and id=12 for example. Either I have to go backwards from the value of id=13, or I have to know when filling id=11 that I have to leave a value available to fill id=12.
My last query:
SELECT
esz,
code,
rank,
CASE
WHEN time IS NOT NULL THEN time
ELSE
CASE
WHEN rank = 1 THEN 1
WHEN LAG(time) OVER(PARTITION BY esz ORDER BY rank) IS NOT NULL AND LEAD(time) OVER(PARTITION BY esz ORDER BY rank) IS NULL THEN LAG(time) OVER(PARTITION BY esz ORDER BY rank) + 1
WHEN LEAD(time) OVER(PARTITION BY esz ORDER BY rank) IS NOT NULL AND LAG(time) OVER(PARTITION BY esz ORDER BY rank) IS NULL THEN LEAD(time) OVER(PARTITION BY esz ORDER BY rank) - 1
WHEN LAG(time) OVER(PARTITION BY esz ORDER BY rank) IS NOT NULL AND LEAD(time) OVER(PARTITION BY esz ORDER BY rank) IS NOT NULL THEN
CASE
WHEN LEAD(time) OVER(PARTITION BY esz ORDER BY rank) - LAG(time) OVER(PARTITION BY esz ORDER BY rank) <= 1 THEN LAG(time) OVER(PARTITION BY esz ORDER BY rank)
ELSE LAG(time) OVER(PARTITION BY esz ORDER BY rank) +1
END
WHEN rank > LEAD(rank) OVER(PARTITION BY esz ORDER BY rank) OR LEAD(rank) OVER(PARTITION BY esz ORDER BY rank) IS NULL THEN 999
END
END AS time
FROM mytable
ORDER BY esz, rank
Here's the result:
id | esz | code | rank | time |
---|---|---|---|---|
1 | 1 | "SOM" | 1 | 5 |
2 | 1 | "QUI" | 2 | 6 |
3 | 1 | "VER" | 3 | 10 |
4 | 1 | "NSC" | 4 | 15 |
5 | 1 | "3SM" | 99 | 16 |
6 | 2 | "QUI" | 1 | 7 |
7 | 2 | "VER" | 2 | 8 |
8 | 2 | "SOM" | 3 | <11> |
9 | 2 | "NSC" | 4 | 12 |
10 | 2 | "3SM" | 99 | 13 |
11 | 3 | "NSC" | 1 | <1> |
12 | 3 | "VER" | 2 | 10 |
13 | 3 | "QUI" | 3 | 11 |
14 | 3 | "SOM" | 4 | 12 |
15 | 3 | "3SM" | 99 | 13 |
16 | 4 | "SOM" | 1 | 2 |
17 | 4 | "QUI" | 2 | 2 |
18 | 4 | "NSC" | 3 | 3 |
19 | 4 | "VER" | 4 | 4 |
20 | 4 | "3SM" | 99 | <999> |
21 | 5 | "NSC" | 1 | <1> |
22 | 5 | "SOM" | 2 | 4 |
23 | 5 | "VER" | 3 | 5 |
24 | 5 | "QUI" | 4 | 7 |
25 | 5 | "3SM" | 99 | 8 |
I've highlighted the values that differ from the expected result. It's not perfect, but there are no inconsistencies. However, there would be a problem if more than 3 consecutive overrides were inserted.
I found a way to resolved this issue correctly.
As there are two NULL
scenario in your inputs. So, design query according to that ...
select id, esz, code, ranks,
-- lag(times) over(partition by esz order by ranks) as prev_time,
-- lead(times) over(partition by esz order by ranks) as next_time,
times,
CASE
-- Previous || Current* || Next :: Scenario
-- ANY || NOT NULL* || ANY
WHEN(times is not NULL)
THEN times
-- NOT NULL || NULL* || NOT NULL
WHEN(times is NULL AND
(lag(times) over(partition by esz order by ranks)) is not NULL AND
(lead(times) over(partition by esz order by ranks)) is not NULL)
THEN CASE
WHEN((lag(times) over(partition by esz order by ranks) + 1) <
(lead(times) over(partition by esz order by ranks)) - 1)
THEN ((lag(times) over(partition by esz order by ranks)) + 1)
ELSE
((lead(times) over(partition by esz order by ranks)) - 1)
END
-- NOT NULL || NULL || NULL* || ANY (esz = 4 and 2)
WHEN(times is NULL AND
(lag(times) over(partition by esz order by ranks)) is NULL AND
(lag(times, 2) over(partition by esz order by ranks)) is not NULL)
THEN CASE
WHEN((lead(times) over(partition by esz order by ranks)) is NULL)
THEN (lag(times, 2) over(partition by esz order by ranks) + 2)
ELSE CASE
WHEN((lag(times, 2) over(partition by esz order by ranks) + 2) <
(lead(times) over(partition by esz order by ranks) - 1))
THEN (lag(times, 2) over(partition by esz order by ranks) + 2)
ELSE (lead(times) over(partition by esz order by ranks) - 1)
END
END
-- ANY || NULL* || NULL || NOT NULL (esz = 3)
WHEN(times is NULL AND
(lead(times) over(partition by esz order by ranks)) is NULL AND
(lead(times, 2) over(partition by esz order by ranks)) is not NULL)
THEN CASE
WHEN ((lag(times) over(partition by esz order by ranks)) is NULL)
THEN (lead(times, 2) over(partition by esz order by ranks) - 2)
ELSE CASE
WHEN((lag(times) over(partition by esz order by ranks) + 1) <
(lead(times, 2) over(partition by esz order by ranks) - 2))
THEN (lag(times) over(partition by esz order by ranks) + 1)
ELSE (lead(times, 2) over(partition by esz order by ranks) - 2)
END
END
-- NOT NULL || NULL* || ANY
WHEN(times is NULL AND
(lag(times) over(partition by esz order by ranks)) is not NULL)
THEN ((lag(times) over(partition by esz order by ranks)) + 1)
-- NULL || NULL* || NOT NULL
WHEN(times is NULL AND
(lag(times) over(partition by esz order by ranks)) is NULL AND
(lead(times) over(partition by esz order by ranks)) is not NULL)
THEN ((lead(times) over(partition by esz order by ranks)) - 1)
ELSE -999
END as time_output
from T1
Run the Query: db<>fiddle
This solution work for two null
scenario top or bottom or both, as follows:
value | null | *(Current Position = NULL) | null | value
value | null | *(Current Position = NULL) | anything
anything | *(Current Position = NULL) | null | value
So, go to db<>fiddle page and there I explain all conditions one by one.
Please add the Case-03
(null-CurrentPos-null
) scenario if required.
I think you understand my approach. So, design your query according to that.