I am trying to come up with T-SQL query to delete data based on these conditions:
Find data that have same Email.
If that user has both AllLocations 'Yes' or 'No', delete entry/entries that are not today's (assuming today is 4/9/2025).
UserName Email AllLocations Location DateLoad
John Doe jdoe@gmail.com No Cerritos 4/9/2025
John Doe jdoe@gmail.com No Cerritos 4/9/2025
John Doe jdoe@gmail.com Yes 2/21/2025
Jane Sue jsue@yahoo.com No Los Angeles 3/31/2025
So, expected outcome is bottom (after deletion):
UserName Email AllLocations Location DateLoad
John Doe jdoe@gmail.com No Cerritos 4/9/2025
John Doe jdoe@gmail.com No Cerritos 4/9/2025
Jane Sue jsue@yahoo.com No Los Angeles 3/31/2025
I tried something like bottom, but obviously, it is wrong:
delete
from [table] a
inner join [table] b
on a.Email = b.Email
where
(
a.AllLocations = 'Yes'
and
b.AllLocations = 'No'
)
or
(
b.AllLocations = 'Yes'
and
a.AllLocations = 'No'
)
and
(
cast( a.DateLoad as date) = cast( GETDATE() as date)
or
cast( b.DateLoad as date) = cast( GETDATE() as date)
)
for your condition Find data that have same Email
, you can use count() with window function
count(*) over (partition by Email)
For condition : If that user has both AllLocations 'Yes' or 'No'
Again use window function to get min() and max()
value and compare
min(AllLocations) over (partition by Email)
<>
max(AllLocations) over (partition by Email)
And your last condition is simply check for DateLoad
with
cte as
(
select *,
email_cnt = count(*) over (partition by Email),
both_location = case when min(AllLocations) over (partition by Email)
<> max(AllLocations) over (partition by Email)
then 1
else 0
end
from [table]
)
delete c
from cte c
where email_cnt > 1
and both_location = 1
and DateLoad <> '20250409'