I have two tables to work from one has a list of dates and their corresponding week they belong, the other has the date a person has had any of a set of 8 tests taken (one row for each test). I want to be able to show the most recent date each test has been taken for each week of the year, regardless of when the test was taken. This is an example of the output I'm trying to achieve:
| Weekending | Personkey | Test 1 | Test 2 | |:-----------|:---------:|:----------:|-----------:| | 2019-01-06 | 1 | 2019-01-04 | 2018-12-15 | | 2019-01-13 | 1 | 2019-01-04 | 2019-01-11 | | 2019-01-20 | 1 | 2019-01-18 | 2019-01-11 | ... | Weekending | Personkey | Test 1 | Test 2 | |:-----------|:---------:|:----------:|-----------:| | 2020-10-25 | 1 | 2019-01-18 | 2019-01-11 | | 2020-11-01 | 1 | 2020-10-30 | 2019-01-11 | | 2020-11-07 | 1 | 2020-10-30 | 2019-01-11 |
So far I've been able (I think) to get whether there was a test in that week for each person, for each week.
| Weekending | Personkey | Test 1 | Test 2 | |:-----------|:---------:|:----------:|-----------:| | 2019-01-06 | 1 | 2019-01-04 | null | | 2019-01-13 | 1 | null | 2019-01-11 | | 2019-01-20 | 1 | 2019-01-18 | null | ... | Weekending | Personkey | Test 1 | Test 2 | |:-----------|:---------:|:----------:|-----------:| | 2020-10-25 | 1 | null | null | | 2020-11-01 | 1 | 2020-10-30 | null | | 2020-11-07 | 1 | null | null |
I've the following query to get this far.
with wkref as (
Select distinct
d.[DateKey]
, d.FirstDayOfWeek
from Dates_table d with(nolock)
where d.CalendarYear between 2018 and YEAR(getdate())
)
, checks as (
Select
Dateadd(d, 6, w.FirstDayOfWeek) 'WeekEnding'
, t.PersonKey
, MAX(case
when t.Measurement = 'Test1' then t.EventDateKey
else null
end) 'Test1_Date'
, MAX(case
when t.Measurement = 'Test2' then t.EventDateKey
else null
end) 'Test2_Date'
from wkref w with(nolock)
left join Tests_table t with(nolock)
on t.EventDateKey = w.DateKey
)
I've tried calculating the number of nulls between entries and statement using LAG where the number of null entries is the number of rows to lag back to.
Select
c.WeekEnding
, c.PersonKey
, c.partn
, c.test1_Date
, LAG(c.test1_date,partn-1,c.test1_Date) over(order by weekending) 'LatestTest1'
from (
Select
c.WeekEnding
, c.PersonKey
, c.Test1_Date
, ROW_NUMBER() over(partition by c.personkey, c.test1_date order by c.weekending asc) 'partn'
from checks c
) c
Although this hasn't worked. My use of ROW_NUMBER() isn't bringing back the number of rows between non-null values, just the total number of non-null values. It then isn't filling in all the non-null rows, just rows where there's already a value - so I know I'm still pretty far from the right answer.
I've tried simpler options like self-joins and joins based on testdate <= weekending and I don't think those have worked. Particularly the solution in here: Fetch the rows which have the Max value for a column for each distinct value of another column
So my questions are:
I've tried to set up a live example in SQLFiddle, as this has gotten quite long and complex, but that didn't go very well either. This is the first time I haven't been able to Google myself to an answer, and I've been at it all day. Please help!!
(edited for table formatting, which still doesn't seem to be working...)
I have answered my own question. Posting this for posterity, if I get myself in this hole again. The linked post included most of the answer. Left join the tests table into the dates table, so that the dates table is duplicated for every test that meets the condition. Below I've put tests into a CTE to only pull through the tests I'm interested in.
with tests_cte as (
t.id
, t.eventdate
from tests_table t
where t.testtype = 'test in question'
and
t.eventdate between 'desired start' and 'desired end'
)
Select
d.FirstDayofWeek
, count(distinct t1.id) 'People'
from dates_table d
left join tests_cte t
on t.eventdate between dateadd(d,-366,d.firstdayofweek) and d.firstdayofweek
where d.firstdayofweek between 'desired start date' and 'desired end date'
group by d.firstdayofweek