sql-servernullgaps-and-islandswonderware

Replacing null values with the previous non-null value in each group


I'm connecting to Microsoft SQL Server on Tableau through a custom SQL query. I have a table with 3 fields DateTime, TagName, Value, and I want to replace null values in the Value field by the last (respecting the DateTime value) non-null value in each group of TagName.

|---------------------|------------------|-----------------|
|     DateTime        |     TagName      |      Value
|---------------------|------------------|-----------------
|  15.04.2019 16:51:30|         A        |       10
|---------------------|------------------|----------------- 
|  15.04.2019 16:52:42|         A        |       NULL
|---------------------|------------------|----------------- 
|  15.04.2019 16:53:14|         A        |       NULL
|---------------------|------------------|----------------- 
|  15.04.2019 17:52:14|         A        |       15
|---------------------|------------------|----------------- 
|  15.04.2019 16:51:30|         B        |       NULL
|---------------------|------------------|----------------- 
|  15.04.2019 16:52:42|         B        |       NULL
|---------------------|------------------|-----------------
|  15.04.2019 16:53:14|         B        |       NULL
|---------------------|------------------|----------------- 
|  15.04.2019 17:52:14|         B        |       15
|---------------------|------------------|-----------------|

The new table should look like this:

|---------------------|------------------|-----------------|
|     DateTime        |     Computer     |      Value
|---------------------|------------------|-----------------
|  15.04.2019 16:51:30|         A        |       10
|---------------------|------------------|----------------- 
|  15.04.2019 16:52:42|         A        |       10
|---------------------|------------------|----------------- 
|  15.04.2019 16:53:14|         A        |       10
|---------------------|------------------|----------------- 
|  15.04.2019 17:52:14|         A        |       15
|---------------------|------------------|----------------- 
|  15.04.2019 16:51:30|         B        |       0
|---------------------|------------------|----------------- 
|  15.04.2019 16:52:42|         B        |       0
|---------------------|------------------|-----------------
|  15.04.2019 16:53:14|         B        |       0
|---------------------|------------------|----------------- 
|  15.04.2019 17:52:14|         B        |       15
|---------------------|------------------|-----------------|

This is already what I've tried, but it replaces NULL values without considering the TagNames values (In this example there is only one TagName).

SELECT  Computer, DateTime
,       CASE 
        WHEN Value IS NULL 
        THEN                                
       (SELECT TOP 1 Value 
        FROM History 
        WHERE DateTime<T.DateTime 
              AND TagName='RM02EL00CPT81.rEp'
              AND DateTime >='2018-12-31 23:59:00' 
              AND wwRetrievalMode='Delta'
              AND Value IS NOT NULL ORDER BY DateTime DESC
       ) 
        ELSE Value 
        END 
        AS ValueNEW
FROM History T
WHERE  TagName='RM02EL00CPT81.rEp' AND DateTime >='2018-12-31 23:59:00' AND wwRetrievalMode='Delta'

I wanted to do almost the same thing by adding OVER(PARTITION BY TagName), but it threw an error. (This is because it doesn't work with SELECT TOP 1.)


Solution

  • This is a "classic" Gaps and Islands question. You can achieve this without a 2 scans, or a triangular join by using the window functions:

    WITH VTE AS(
        SELECT CONVERT(datetime, [DateTime],104) AS [DateTime],
               TagName,
               [Value]
        FROM (VALUES ('15.04.2019 16:51:30','A',10  ),
                     ('15.04.2019 16:52:42','A',NULL),
                     ('15.04.2019 16:53:14','A',NULL),
                     ('15.04.2019 17:52:14','A',15  ),
                     ('15.04.2019 16:51:30','B',NULL),
                     ('15.04.2019 16:52:42','B',NULL),
                     ('15.04.2019 16:53:14','B',NULL),
                     ('15.04.2019 17:52:14','B',15  )) V([DateTime],TagName,[Value])),
    Grps AS(
        SELECT [DateTime],
               TagName,
               [Value],
               COUNT(CASE WHEN [Value] IS NOT NULL THEN 1 END) OVER (PARTITION BY TagName ORDER BY [DateTime]
                                                                     ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Grp
        FROM VTE)
    SELECT DateTime,
           TagName,
           ISNULL(MAX([Value]) OVER (PARTITION BY TagName, Grp),0) AS [Value]
    FROM Grps
    ORDER BY TagName, [DateTime]