I have a list of timestamps for data acquisition. timestamps close together belong to a cycle. I want to enumerate these cycles. So whenever between two timestamps is more then 100 seconds then it will create the next cycle.
CREATE TABLE [Cycles](
[Cycle] [int] NOT NULL,
[CycleStart] [datetime] NOT NULL,
[CycleEnd] [datetime] NOT NULL,
CONSTRAINT [PK_Cycles] PRIMARY KEY CLUSTERED
(
[Cycle] DESC
))
INSERT INTO [Cycles] VALUES
(10,'2023-12-04T9:00:00','2023-12-04T10:00:00'),
(11,'2023-12-04T21:00:00','2023-12-04T22:00:00'),
(12,'2023-12-04T23:00:00','2023-12-05T00:00:00')
CREATE TABLE [Data](
[datatimestamp] [datetime] NOT NULL,
CONSTRAINT [PK_Data] PRIMARY KEY NONCLUSTERED
(
[datatimestamp] ASC
))
INSERT INTO [Data] VALUES
('2023-12-05T00:05:20'),
('2023-12-05T00:05:21'),
('2023-12-05T00:05:22'),
('2023-12-05T00:10:01'),
('2023-12-05T00:10:02'),
('2023-12-05T00:10:03')
So I would need to add to Cycles
13
and 14
Here is what I could do as a Select:
DECLARE @lastCycle int = (SELECT TOP 1 Cycle FROM Cycles ORDER BY Cycle DESC);
DECLARE @lastCycleEnd datetime = (SELECT TOP 1 CycleEnd FROM Cycles ORDER BY Cycle DESC);
WITH marks AS (
SELECT datatimestamp,
CASE
WHEN DATEDIFF(Second, LAG(datatimestamp, 1, DATEADD(Second, -101, datatimestamp)) OVER (ORDER BY datatimestamp), datatimestamp) > 100
THEN 1 ELSE 0
END AS NextC
FROM [Data]
WHERE datatimestamp > @lastCycleEnd
)
SELECT @lastCycle + ROW_NUMBER() OVER (ORDER BY d.datatimestamp) AS Cycle, d.datatimestamp AS CycleBegin
FROM [Data] d
INNER JOIN marks m On m.datatimestamp = d.datatimestamp
WHERE m.NextC = 1
Which will return the new Cycles and their CycleStarts, as the result for the example data would look like this:
Cycle | CycleStart |
---|---|
13 | 2023-12-05 00:05:20 |
14 | 2023-12-05 00:10:01 |
How can I get the CycleEnd aswell as the third column?
Instead of filtering on NextC
as the final step, once you have your data from the marks
CTE. i.e.
datatimestamp | NextC |
---|---|
2023-12-05T00:05:20 | 1 |
2023-12-05T00:05:21 | 0 |
2023-12-05T00:05:22 | 0 |
2023-12-05T00:10:01 | 1 |
2023-12-05T00:10:02 | 0 |
2023-12-05T00:10:03 | 0 |
You need to then do SUM(NextC) OVER(ORDER BY datatimestamp)
, which will give you a value for each group by timestamps, i.e.
datatimestamp | Cycle |
---|---|
2023-12-05T00:05:20 | 1 |
2023-12-05T00:05:21 | 1 |
2023-12-05T00:05:22 | 1 |
2023-12-05T00:10:01 | 2 |
2023-12-05T00:10:02 | 2 |
2023-12-05T00:10:03 | 2 |
You can then group on this column and get the min and max datetimes to get your start/end. So your final query would be:
DECLARE @lastCycle int = (SELECT TOP 1 Cycle FROM Cycles ORDER BY Cycle DESC);
DECLARE @lastCycleEnd datetime = (SELECT TOP 1 CycleEnd FROM Cycles ORDER BY Cycle DESC);
WITH marks AS (
SELECT datatimestamp,
CASE
WHEN DATEDIFF(Second, LAG(datatimestamp, 1, DATEADD(Second, -101, datatimestamp)) OVER (ORDER BY datatimestamp), datatimestamp) > 100
THEN 1 ELSE 0
END AS NextC
FROM [Data]
WHERE datatimestamp > @lastCycleEnd
), marks2 AS (
SELECT m.DataTimeStamp, SUM(m.NextC) OVER (ORDER BY m.DataTimeStamp) AS Cycle
FROM marks AS m)
SELECT @lastCycle + ROW_NUMBER() OVER (ORDER BY m.Cycle) AS Cycle,
MIN(m.datatimestamp) AS CycleBegin ,
MAX(m.datatimestamp) AS CycleEnd
FROM marks2 m
GROUP BY m.Cycle;