sqlsql-server

Clean up a SQL Server Query with too many DISTINCT clauses in it


In SQL Server, I have a join table Visit between the Customer and Store tables.

I am trying to find:

I think that the following query is correct - in that it appears to return the correct result:

SELECT DISTINCT 
    v.CustomerID
FROM
    Visit v
INNER JOIN 
    (SELECT CustomerID 
     FROM 
         (SELECT DISTINCT CustomerID, StoreID  
          FROM Visit) temp
     GROUP BY CustomerID
     HAVING COUNT(*) = 1) c ON v.CustomerID = c.CustomerID
WHERE 
    v.StoreID IN (1, 10, 16, 42)

However I want know if there is any way to simplify/clean it up as there are multiple DISTINCT and GROUP BY clauses in it.

Note: It's not ideal, but if necessary I can live with a query that returns customers who have visited any of desired Stores (1, 10, 16, 42) in any combination, any number of times as long as they haven't visited a store not in that list.


Solution

  • Something like this perhaps:

    SELECT v.CustomerID, MAX(v.StoreID) AS StoreID
    FROM Visit v
    GROUP BY CustomerID
    HAVING MAX(StoreId) = MIN(StoreId) -- Alternatively COUNT(DISTINCT v.StoreID) = 1
    AND MAX(v.StoreID) IN (1, 10, 16, 42)
    

    COUNT(DISTINCT StoreID) gives you customers with only one unique store

    Note: You can also use: MAX(StoreId) = MIN(StoreId) instead of COUNT(DISTINCT) because it has in some cases better performance characteristics.

    To get which store it is, I use MAX(x.StoreID). MAX or MIN doesn't matter here, as long as it's some kind of aggregation.