I have this table
MachineCode | DoCreation | MachineStateCode |
---|---|---|
DM139 | 2024-04-03 00:32:17.377 | 99 |
DM139 | 2024-04-03 00:32:49.080 | 0 |
DM139 | 2024-04-03 01:51:14.427 | 99 |
DM139 | 2024-04-03 01:51:45.643 | 0 |
DM139 | 2024-04-03 06:07:38.357 | 3 |
DM139 | 2024-04-03 06:07:39.043 | 0 |
DM139 | 2024-04-03 06:23:37.433 | 100 |
DM139 | 2024-04-03 06:23:43.697 | 0 |
DM139 | 2024-04-03 06:24:46.153 | 1 |
DM139 | 2024-04-03 06:25:02.467 | 0 |
DM139 | 2024-04-03 06:46:50.410 | 100 |
DM139 | 2024-04-03 06:47:29.247 | 0 |
DM139 | 2024-04-03 06:47:46.397 | 100 |
DM139 | 2024-04-03 06:53:47.160 | 0 |
DM139 | 2024-04-03 06:59:41.633 | 1 |
DM139 | 2024-04-03 06:59:54.550 | 0 |
DM139 | 2024-04-03 07:00:03.203 | 1 |
DM139 | 2024-04-03 07:00:06.437 | 0 |
DM139 | 2024-04-03 07:00:14.247 | 1 |
DM139 | 2024-04-03 07:00:17.683 | 2 |
DM139 | 2024-04-03 07:00:18.153 | 1 |
DM139 | 2024-04-03 07:00:18.840 | 2 |
DM139 | 2024-04-03 07:00:26.860 | 1 |
DM139 | 2024-04-03 07:00:46.387 | 2 |
each records rapresent a "change of state" of the specific machine DM139 where
I need to create a new column, let's call it MachineStateCodeAdjusted where its value is equal to 3 if the previous MachineStateCode is 3 and the current MachineStateCode is not 2, otherwise it should have the current MachineStateCode value. (the next table is what I need)
MachineCode | DoCreation | MachineStateCode | MachineStateCodeAdjusted |
---|---|---|---|
DM139 | 2024-04-03 00:32:17.377 | 99 | 99 |
DM139 | 2024-04-03 00:32:49.080 | 0 | 0 |
DM139 | 2024-04-03 01:51:14.427 | 99 | 99 |
DM139 | 2024-04-03 01:51:45.643 | 0 | 0 |
DM139 | 2024-04-03 06:07:38.357 | 3 | 3 |
DM139 | 2024-04-03 06:07:39.043 | 0 | 3 |
DM139 | 2024-04-03 06:23:37.433 | 100 | 3 |
DM139 | 2024-04-03 06:23:43.697 | 0 | 3 |
DM139 | 2024-04-03 06:24:46.153 | 1 | 3 |
DM139 | 2024-04-03 06:25:02.467 | 0 | 3 |
DM139 | 2024-04-03 06:46:50.410 | 100 | 3 |
DM139 | 2024-04-03 06:47:29.247 | 0 | 3 |
DM139 | 2024-04-03 06:47:46.397 | 100 | 3 |
DM139 | 2024-04-03 06:53:47.160 | 0 | 3 |
DM139 | 2024-04-03 06:59:41.633 | 1 | 3 |
DM139 | 2024-04-03 06:59:54.550 | 0 | 3 |
DM139 | 2024-04-03 07:00:03.203 | 1 | 3 |
DM139 | 2024-04-03 07:00:06.437 | 0 | 3 |
DM139 | 2024-04-03 07:00:14.247 | 1 | 3 |
DM139 | 2024-04-03 07:00:17.683 | 2 | 2 |
DM139 | 2024-04-03 07:00:18.153 | 1 | 1 |
DM139 | 2024-04-03 07:00:18.840 | 2 | 2 |
DM139 | 2024-04-03 07:00:26.860 | 1 | 1 |
DM139 | 2024-04-03 07:00:46.387 | 2 | 2 |
I used the LAG function to get the previous MachineStateCode (and it works) but to work as I want I need to get the previous MachineStateCodeAdjusted, that is not accessible for the LAG function.
The SQL query I'm executing is this:
SELECT MachineCode
, DoCreation
, MachineStateCode
, IIF(LAG(MachineStateCode, 1, MachineStateCode) OVER (ORDER BY DoCreation ASC) = 3
AND MachineStateCode<>2, 3, MachineStateCode) AS MachineStateCodeAdjusted
FROM mch.MachineStateChanges
and the result I get clearly show me the it not works as I want
MachineCode | DoCreation | MachineStateCode | MachineStateCodeAdjusted |
---|---|---|---|
DM139 | 2024-04-03 00:32:17.377 | 99 | 99 |
DM139 | 2024-04-03 00:32:49.080 | 0 | 0 |
DM139 | 2024-04-03 01:51:14.427 | 99 | 99 |
DM139 | 2024-04-03 01:51:45.643 | 0 | 0 |
DM139 | 2024-04-03 06:07:38.357 | 3 | 3 |
DM139 | 2024-04-03 06:07:39.043 | 0 | 3 |
DM139 | 2024-04-03 06:23:37.433 | 100 | 100 WRONG |
DM139 | 2024-04-03 06:23:43.697 | 0 | 0 WRONG |
DM139 | 2024-04-03 06:24:46.153 | 1 | 1 WRONG |
I also tried with a recursive CTE and it works but it is very slow! The query extract in 4 minutes 2 days of work, while I need to extract 1 year!
What is the best (and fast) approach to fix it?
I think your explanation is incomplete. You should have indicated that once you get an "error" (3) then all states do not matter, until you get a "working" (2) again.
In this case, this is a Gaps & Islands problem. You can use the traditional solution:
select z.*,
case when f = 3 or (
lag(MachineStateCode)
over(partition by MachineCode order by DoCreation) = 3
and MachineStateCode <> 2
)
then 3
else MachineStateCode
end as MachineStateCodeAdjusted
from (
select y.*,
first_value(MachineStateCode) over(partition by g order by DoCreation) as f
from (
select x.*, sum(i) over(partition by MachineCode order by DoCreation) as g
from (
select t.*, case when MachineStateCode in (2, 3) then 1 else 0 end as i
from MachineStateChanges t
) x
) y
) z
Result:
MachineCode DoCreation MachineStateCode i g f MachineStateCodeAdjusted
------------ ---------------------------- ----------------- -- -- --- ------------------------
DM139 2024-04-03 00:32:17.3770000 99 0 0 99 99
DM139 2024-04-03 00:32:49.0800000 0 0 0 99 0
DM139 2024-04-03 01:51:14.4270000 99 0 0 99 99
DM139 2024-04-03 01:51:45.6430000 0 0 0 99 0
DM139 2024-04-03 06:07:38.3570000 3 1 1 3 3
DM139 2024-04-03 06:07:39.0430000 0 0 1 3 3
DM139 2024-04-03 06:23:37.4330000 100 0 1 3 3
DM139 2024-04-03 06:23:43.6970000 0 0 1 3 3
DM139 2024-04-03 06:24:46.1530000 1 0 1 3 3
DM139 2024-04-03 06:25:02.4670000 0 0 1 3 3
DM139 2024-04-03 06:46:50.4100000 100 0 1 3 3
DM139 2024-04-03 06:47:29.2470000 0 0 1 3 3
DM139 2024-04-03 06:47:46.3970000 100 0 1 3 3
DM139 2024-04-03 06:53:47.1600000 0 0 1 3 3
DM139 2024-04-03 06:59:41.6330000 1 0 1 3 3
DM139 2024-04-03 06:59:54.5500000 0 0 1 3 3
DM139 2024-04-03 07:00:03.2030000 1 0 1 3 3
DM139 2024-04-03 07:00:06.4370000 0 0 1 3 3
DM139 2024-04-03 07:00:14.2470000 1 0 1 3 3
DM139 2024-04-03 07:00:17.6830000 2 1 2 2 2
DM139 2024-04-03 07:00:18.1530000 1 0 2 2 1
DM139 2024-04-03 07:00:18.8400000 2 1 3 2 2
DM139 2024-04-03 07:00:26.8600000 1 0 3 2 1
DM139 2024-04-03 07:00:46.3870000 2 1 4 2 2
See running example at db<>fiddle.