sqlsql-server

Calculate Mashed Potatoes Percentage


I have to write a report that shows:

This is my query:

SELECT 
    [Location],
    [Type] AS [Varietal],
    COUNT([MugId]) AS [Mug Count],
    SUM([Weight]) AS [Pounds],
    (SELECT SUM([Weight]) 
     FROM SpudIndex 
     GROUP BY [Type]) / (SELECT SUM([Weight]) 
                         FROM SpudIndex 
                         GROUP BY [Location]) * 100 AS [Percent]
FROM
    SpudIndex
GROUP BY 
    [Location], [Type]

I get the following error:

Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.

I know that it's an issue with the percentage column, but I'm having trouble wrapping my head around this problem. Please advise.


Solution

  • Explanation

    Technically

    You are asking SQL, for each unique [Location],[Type] pair, to append a column [Percent] computed as "every distinct [Type]'s [Weight] divided by every distinct [Location]'s [Weight]".
    Thus you would get as many values as you have different types and locations,
    which for SQL is impossible to put in 1 column that holds only 1 value (we're speaking of a scalar type column).

    Functionally

    Thus first you have to redefine what you expect in this column.

    I would say that your subquery tries to get the weight for each type divided by the weight for each location,
    while you want the weight for the current type and location divided by the weight for the current location.

    Solving

    By adapting your query

    Mapping this newly expressed specification to pseudo-code would give something like:

    (sum(weight) where location = thisresultrow.location and type = thisresultrow.type)
    /
    (sum(weight) where location = thisresultrow.location)
    

    But the first part does not need a sub-SELECT, as it exactly matches what you are computing in your main SELECT (Sum([Weight]) AS [Pounds] GROUP BY [Location], [Type]), so you can replace it.

    Thus your query becomes (after improving readability with an lt alias for the GROUP BY Location,Type, and an l alias for the GROUP BY Location):

    SELECT [Location],
    [Type] AS [Varietal],
    Count([MugId]) AS [Mug Count],
    Sum([Weight]) AS [Pounds]
    , Sum([Weight]) / (SELECT Sum([Weight]) FROM SpudIndex l WHERE l.[Location] = lt.[Location]) * 100 AS [Percent]
    from SpudIndex lt
    GROUP BY [Location],[Type];
    
    With window functions

    But you can use window functions which are done exactly for that: comparing or handling each row of the resultset with other rows that share one or multiple common properties.

    Here you would like to compare one type and location pair with all the mugs with the same location (so there is our common property: the location).

    And to help us further, we will split your query into two steps with Common Table Expressions, which allow to first compute an intermediate resultset (here the Sum(Weight) for each individual type and location pair), then use that resultset in a second, final query (here use the Pounds and compare them with the window function).

    WITH raw AS
    (
      SELECT [Location],
      [Type] AS [Varietal],
      Count([MugId]) AS [Mug Count],
      Sum([Weight]) AS [Pounds]
      from SpudIndex lt
      GROUP BY [Location],[Type]
    )
    SELECT
      *,
      [Pounds] / Sum([Pounds]) OVER (PARTITION BY [Location]) * 100 AS [Percent]
    FROM raw;
    

    So the CTE (WITH raw AS (…)) returns each [Location],[Type] independently of others,
    then in the final query we divide [Pounds] (of each row from the raw CTE) by the "sum of [Pounds] of all raw rows with the same [Location]".

    (I made a small fiddle with fictive data to illustrate)