In my searching for answers, I seem to only be finding explanations that cover the existence of NULL
which is why the NOT IN
returns 0 results. However, my scenario is exactly the opposite. I'm getting my expected results with the NOT IN
and my NOT EXISTS
is giving me 0. And to clarify, I have no NULLs in my sub-query. Here is my query:
DECLARE @EndDate DATE= CAST(CONCAT(YEAR(GETDATE()), '-', MONTH(GETDATE()), '-01') AS DATE) --First day of this month
DECLARE @StartDate DATE= DATEADD(month, -12, @EndDate) --12 months prior
SELECT Deactivated = COUNT(DISTINCT o.ClinicLocationId)
FROM [order].package p WITH(NOLOCK)
INNER JOIN [order].[order] o WITH(NOLOCK) ON o.packageid = p.packageid
INNER JOIN profile.ClinicLocationInfo cli WITH(NOLOCK) ON cli.LocationId = o.ClinicLocationId
AND cli.FacilityType IN('CLINIC', 'HOSPITAL')
WHERE CAST(p.ShipDTM AS DATE) >= dateadd(month,-1,@StartDate)
AND CAST(p.ShipDTM AS DATE) < dateadd(month,-1,@EndDate)
AND p.isshipped = 1
AND o.IsShipped = 1
AND ISNULL(o.iscanceled, 0) = 0
and not exists (
--and o.ClinicLocationId not in (
SELECT DISTINCT o.ClinicLocationId
FROM [order].package p WITH(NOLOCK)
INNER JOIN [order].[order] o WITH(NOLOCK) ON o.packageid = p.packageid
INNER JOIN profile.ClinicLocationInfo cli WITH(NOLOCK) ON cli.LocationId = o.ClinicLocationId
AND cli.FacilityType IN('CLINIC', 'HOSPITAL')
WHERE CAST(p.ShipDTM AS DATE) >= @StartDate
AND CAST(p.ShipDTM AS DATE) < dateadd(day,-1,@EndDate)
AND p.isshipped = 1
AND o.IsShipped = 1
AND ISNULL(o.iscanceled, 0) = 0
)
For a high level overview, I'm basically trying to find the number of ID's that exist in one set that don't in the next (separated by a 12 month rolling window, offset by 1 month). But for the sake of simplicity, I've written the below that very simply illustrates the exact same symptom:
drop table if exists #T1, #T2
create table #T1 (id int)
create table #T2 (id int)
insert into #T1 (id)
values
(3),
(8)
insert into #T2 (id)
values
(671),
(171)
select id from #T1 where id not in (select id from #T2)
select id from #T1 where not exists (select id from #T2)
My expectation is that both of these would yield the same results, the contents of #T1
(3,8) but instead, I only get those results in the second query by eliminating the NOT
. I would assume I'm suffering from a fundamental misunderstanding of how the EXISTS
operator works, as up until now I assumed there was no real difference aside from how the scanning occurred and NULL handling.
Where am I going wrong with my expectation?
The query shape...
and o.ClinicLocationId not in (SELECT o.ClinicLocationId ...)
...correlates o.ClinicLocationId
to o.ClinicLocationId
in the subquery.
When using exists
you have to write a correlated subquery to get the same effect:
and not exists (SELECT o1.ClinicLocationId ...
AND o1.ClinicLocationId = o.ClinicLocationId)
Note that the second query requires a different alias in the subquery.