sqlsql-server

How to find not exist data between two times in SQL


I have two date parameter.

@startTime = '2020-08-19 14:00'
@endTime = '2020-08-19 21:00'

My table is

table_test|  id  |    time        |  value  | device_id
         1|  11  |2020-08-19 14:00|   45    |     1
         2|  12  |2020-08-19 16:00|   85    |     1
         3|  13  |2020-08-19 17:00|   14    |     1
         4|  14  |2020-08-19 18:00|   28    |     1
         5|  15  |2020-08-19 20:00|   72    |     1
         6|  16  |2020-08-19 21:00|   02    |     1

I want to find data that is not in two given time intervals but also to select any column. For example, my output shoul be

         *|  device_id |    time        |
         1|     1      |2020-08-19 15:00|
         2|     1      |2020-08-19 19:00|

 

How I do ?

First of all, I tried internal table, I found which device_id like that

SELECT device_id
FROM table_test
WHERE time between @startTime and @endTime 
GROUP BY device_id
HAVING COUNT(*) < 8

And, in cursor insert internal table

OPEN contact_cursor;  
FETCH NEXT FROM contact_cursor INTO
@deviceId;
WHILE @@FETCH_STATUS = 0  
BEGIN -- BEGIN - contact_cursor

    WHILE(@currentTime <= @endTime)
    BEGIN -- WHILE
        INSERT INTO @current_table VALUES (@deviceId,@currentTime)
        SET @currentTime = dateadd(HOUR,1 ,@currentTime); 
    END -- WHILE
    SET @currentTime = @startTime; 

FETCH NEXT FROM contact_cursor INTO
@deviceId;  
END -- END - contact_cursor 
  
CLOSE contact_cursor;  
DEALLOCATE contact_cursor;  

After that found not exist value

SELECT cur.device_id cur.measurement_time FROM @current_table AS cur 
LEFT JOIN table_test AS t ON t.device_id = cur.device_id and t.time = cur.time 
WHERE t.time is null 

But these are very costly for long periods of time. What can I do ?


Solution

  • Rather than a recursive solution, use a tally. Tally Tables are significantly faster than both a CURSOR and a recursive Common Table Expression (see my post Speed Test: rCTE vs Tally for more on the subject). The reason being is that a Tally uses a set-based solution, something that RDBMS excel at. Conversely, RDBMS are not well designed for iterative tasks, which both a CURSOR and rCTE are.

    I build one inline here, however, you can create and store them locally for easy reuse. I also assume that you want this for multiple devices, so I have included a CTe which combines the times and device IDs.

    DECLARE @StartTime smalldatetime,
            @EndTime smalldatetime;
    SET @startTime = '2020-08-19T14:00:00';
    SET @endTime = '2020-08-19T21:00:00';
    
    WITH N AS(
        SELECT N
        FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N(N)),
    Tally AS(
        SELECT TOP(SELECT DATEDIFF(HOUR, @StartTime, @EndTime) + 1)
               ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) -1 AS I
        FROM N N1, N N2, N N3, N N4, N N5), --100,000 rows, 11.5 years~
    Times AS(
        SELECT DATEADD(HOUR, T.I, @StartTime) AS T
        FROM Tally T),
    Devices AS(
        SELECT T.T
               D.DeviceID
        FROM Times T
             CROSS JOIN (SELECT DISTINCT DeviceID FROM dbo.YourTable) D)
    SELECT D.DeviceID,
           D.T AS [Time]
    FROM Devices D
         LEFT JOIN dbo.YourTable YT ON D.T = YT.[Time]
                                   AND D.DeviceID = YT.DeviceID
    WHERE YT.id IS NULL;