I’m working with time-series data in SQL Server and need to retrieve the last valid value for each day. A valid value is defined as one that is non-null and not zero.
The challenge is that data points may continue to be recorded after the last meaningful value of the day — such as system-generated filler values (e.g., 0 or non-null placeholders). For example, if the last real value on a particular day is at 9:46 PM, followed by entries with 0 or null values, I want the 9:46 PM value to be selected as the final data point for that date.
My goal is to write a query that returns exactly one row per day, capturing only the final non-filler value, even if invalid entries follow. I need to apply this across an entire month or year, depending on the input date range.
CREATE OR ALTER PROCEDURE dbo.Weekly_CSS1_LT_Coasting
AS
BEGIN
SET NOCOUNT ON;
-- Define current month boundaries
DECLARE @StartDate DATE = DATEFROMPARTS(YEAR(GETDATE()), MONTH(GETDATE()), 1);
DECLARE @EndDate DATE = DATEADD(MONTH, 1, @StartDate);
DECLARE @DummyDate DATE = DATEADD(DAY, -1, @StartDate); -- Last day of previous month (optional row)
-- Generate all dates in current month
;WITH DateList AS (
SELECT @StartDate AS [Date]
UNION ALL
SELECT DATEADD(DAY, 1, [Date])
FROM DateList
WHERE DATEADD(DAY, 1, [Date]) < @EndDate
),
-- Get last NON-NULL value per day
ValidHistory AS (
SELECT
CAST(H.DateTime AS DATE) AS [Date],
H.Value,
H.DateTime,
ROW_NUMBER() OVER (
PARTITION BY CAST(H.DateTime AS DATE)
ORDER BY
CASE WHEN H.Value IS NOT NULL THEN 0 ELSE 1 END,
H.DateTime DESC
) AS rn
FROM Runtime.dbo.History H
WHERE
H.TagName = 'CSS1_Streetlight.LT_KWH_PerDay_Costing'
AND H.wwRetrievalMode = 'Cyclic'
AND H.wwCycleCount = 100
AND H.wwQualityRule = 'Extended'
AND H.wwVersion = 'Latest'
AND H.DateTime >= @StartDate
AND H.DateTime < @EndDate
),
LastValidValue AS (
SELECT [Date], Value
FROM ValidHistory
WHERE rn = 1
)
-- Final SELECT: Join directly without extra CTE
SELECT @DummyDate AS [Date], 0 AS TotalCombinedValue
UNION ALL
SELECT
D.[Date],
ISNULL(L.Value, 0) AS TotalCombinedValue
FROM DateList D
LEFT JOIN LastValidValue L ON D.[Date] = L.[Date]
ORDER BY [Date]
OPTION (MAXRECURSION 0);
END;
You could use the LAST_VALUE()
window function:
LAST_VALUE(…) OVER(PARTITION BY <your period of reference>)
,
or FIRST_VALUE()
after inverting the order (as ORDER BY
window functions go from start to "the current row", their LAST_VALUE()
would be that of the current row, unless typing a long, long ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
. Hey, I said I didn't want to type it!).
As for the 0s and nulls: we could use FIRST_VALUE(NULLIF(val, 0)) IGNORE NULLS
, but to get past the initial NULL
s anyway we have to extend the window with UNBOUNDED FOLLOWING
;
or use an ORDER
trick to get them to the end (but if we put them at the end we don't need the IGNORE NULLS
).
But in fact the query you wrote in the question nearly works, with the "ORDER
trick" method:
you just have to change your CASE WHEN H.Value IS NOT NULL
to CASE WHEN H.Value <> 0
(which implicitely discards NULL
s too).
Then you can simplify by using DISTINCT FIRST_VALUE()
instead of ROW_NUMBER()
+ filtering on rn = 1
.
(I've put your fixed query as the forelast one of this fiddle, just under the "Just changing to Value <> 0" comment)