sqlsql-serverwindow-functions

Select first column for multiple occurrences of second column and a condition from third column


I have a table which looks like this

Individual_ID   Household_ID    AccountType
-------------------------------------------
101             HH_101          CHK
102             HH_101          SAV
103             HH_101          CHK
104             HH_102          CHK
105             HH_103          SAV
106             HH_104          SAV
107             HH_104          CHK
108             HH_105          SAV
109             HH_105          SAV
110             HH_106          CHK
111             HH_106          SAV

I want to select all the Individual_ID's which have the same Household_ID and a Checking Account (CHK = Checking and SAV = Savings).

I am trying the below query

WITH cte AS
(
    SELECT 
        Individual_ID, Household_ID, AccountType,
        ROW_NUMBER() OVER (PARTITION BY Household_ID ORDER BY Individual_ID ASC) AS Acct_Per_Household
    FROM
        [Test].[dbo].[AccountsInfo]
)
SELECT * 
FROM cte 
WHERE Acct_Per_Household > 1 AND AccountType = 'CHK'

But the above query is excluding Individual_ID = 101 because of the way it is written. So what I am trying to achieve is if a Household_ID is occurring more than once, then select all Individual_ID who have a checking account(CHK). So the output should look like below.

Individual_ID   Household_ID    AccountType
-------------------------------------------
101             HH_101          CHK
103             HH_101          CHK
107             HH_104          CHK
110             HH_106          CHK

I know this can be done through windowing functions, but I am not sure about which function to use to achieve the goal.


Solution

  • Try this:

    
    
    WITH cte AS 
    (
      SELECT 
          Household_ID as hid, count(Individual_ID) as num_indivs
      FROM AccountsInfo
      WHERE 
       AccountType = 'CHK'
      GROUP BY Household_ID
      HAVING count(Individual_ID) > 1
    )
    SELECT Individual_ID, HouseHold_ID, AccountType
    FROM
       AccountsInfo ai INNER JOIN cte ON ai.Household_ID = cte.hid
    WHERE 
       AccountType = 'CHK'
    ORDER BY 
       Individual_ID
    

    Basically the cte figures out which households have more than one individual per household that have checking accounts.

    The JOIN gets you the details about those households.

    You can probably use an IN with a subquery as well, but a JOIN is often considered faster. I would profile both and see which performs better for your use case.

    Your use of ROW_ID and PARTITION may lead to a solution, but I think this more classic solution may be easier to understand.