sqlsql-servert-sqlaggregategrouping

Aggregate into custom groups


I'm not sure if this can be done in one stored procedure. I got it working using nested views but I want it to be as concise as possible.

Every time I try to select from a subquery, I get errors that I can't debug. I understand the concepts of how to get an output but struggle with the syntax.

Suppose we have a large table listing every household in the United States in the following format:

[dbo].[US_HOUSEHOLDS_TABLE]

HOUSEHOLD_ID FAMILY_NAME MEMBER_NAME
1 Anderson Robert
1 Anderson Mary
1 Anderson Cody
1 Anderson Brittany
2 Carver William
2 Carver Suzanne
3 Washington Fred
3 Washington Ethel
3 Washington Tony
4 Smith Darlene
4 Smith Amy
4 Smith Richard

Desired output:

FAMILY_SIZE FREQUENCY
1 23,903,124
2 57,116,548
3 38,617,295
4 12,397,106
5 3,108,284
6 or more 1,393,571

Solution

  • You can do something like this perhaps:

    SELECT  *
    INTO #data
    FROM    (
        VALUES  (1, N'Anderson', N'Robert')
        ,   (1, N'Anderson', N'Mary')
        ,   (1, N'Anderson', N'Cody')
        ,   (1, N'Anderson', N'Brittany')
        ,   (2, N'Carver', N'William')
        ,   (2, N'Carver', N'Suzanne')
        ,   (3, N'Washington', N'Fred')
        ,   (3, N'Washington', N'Ethel')
        ,   (3, N'Washington', N'Tony')
        ,   (4, N'Smith', N'Darlene')
        ,   (4, N'Smith', N'Amy')
        ,   (4, N'Smith', N'Richard')
        ,   (4, N'Smith', N'Darlene')
        ,   (4, N'Smith', N'Amy')
        ,   (4, N'Smith', N'Richard')
        ,   (5, 'Sigge', 'Mannen')
        ,   (5, 'Stack', 'Overflow')
        
    ) t (HOUSEHOLD_ID,FAMILY_NAME,MEMBER_NAME)
    
    SELECT  CASE WHEN cnt < 6 THEN CAST(cnt AS VARCHAR(30)) ELSE '6 or more' END AS HouseHold
    ,   COUNT(*) AS Total
    FROM    (
        SELECT  count(*) AS cnt
        FROM    #data
        GROUP BY HOUSEHOLD_ID
        ) x
    GROUP BY CASE WHEN cnt < 6 THEN CAST(cnt AS VARCHAR(30)) ELSE '6 or more' END
    

    First, you get the count by the household and then you group by that count.

    Those with 6+ get their own group.

    Outputs:

    HouseHold Total
    2 2
    3 1
    4 1
    6 or more 1