sqlt-sqlazure-sql-databaseaggregate-functions

Aggregating IDs within a PARTITION BY query


I've got this SELECT :

SELECT OS.returnmessageid,
       OS.tagconfig,
       CASE
         WHEN OP.messagetype = 202
              AND OS.datapointid IN ( 1, 3 ) THEN
         CONVERT(NVARCHAR(max), OS.lookupvalue)
         ELSE Try_convert(Nvarchar(max), OS.value)
       END AS Value,
       OS.unitnumber,
       OD.messageutc,
       OP.messagetype,
       OS.isproccessed,
       OS.id,
       Row_number()
         OVER (
           partition BY OS.returnmessageid, OS.unitnumber, OD.messageutc,
         OP.messagetype
           ORDER BY OS.returnmessageid, OS.unitnumber, OD.messageutc,
         OP.messagetype )
           AS RowNum
FROM   ooocommreturnmessagesscaling OS
       LEFT JOIN ooocommreturnmessagesparsing OP
              ON OP.id = OS.returnmessageid
       LEFT JOIN ooocommreturnmessagedetails OD
              ON OP.returnmessageid = OD.id

Example result set:

enter image description here

I'd like to add a column that has all the associated values for OS.Id for the same combination of OS.ReturnMessageId, OS.UnitNumber, OD.MessageUTC, OP.MessageType .

This is the output that I'm looking for:

enter image description here

As you can see that last field IdAggregate has been added as a column separated list of OS.Id for the same combination of OS.ReturnMessageId, OS.UnitNumber, OD.MessageUTC, OP.MessageType

How do I add this IdAggregate column?


Solution

  • My first thought was to use the STRING_AGG() function as in STRING_AGG(OS.id, ',') OVER(PARTITION BY OS.returnmessageid) WITHIN GROUP(ORDER BY OS.id) AS IdAggregate. Unfortunately, this is not supported and results in the message "The function 'STRING_AGG' is not a valid windowing function".

    I believe the solution is to wrap the main query up in a CTE, calculate IdAggregate using GROUP BY returnmessageid in another CTE, and then join the two in the final select.

    WITH InitialQuery AS (
        SELECT OS.returnmessageid,
               OS.tagconfig,
               CASE
                 WHEN OP.messagetype = 202
                      AND OS.datapointid IN ( 1, 3 ) THEN
                 CONVERT(NVARCHAR(max), OS.lookupvalue)
                 ELSE Try_convert(Nvarchar(max), OS.value)
               END AS Value,
               OS.unitnumber,
               OD.messageutc,
               OP.messagetype,
               OS.isproccessed,
               OS.id,
               Row_number()
                 OVER (
                   partition BY OS.returnmessageid, OS.unitnumber, OD.messageutc,
                 OP.messagetype
                   ORDER BY OS.returnmessageid, OS.unitnumber, OD.messageutc,
                 OP.messagetype )
                   AS RowNum
        FROM   ooocommreturnmessagesscaling OS
               LEFT JOIN ooocommreturnmessagesparsing OP
                      ON OP.id = OS.returnmessageid
               LEFT JOIN ooocommreturnmessagedetails OD
                      ON OP.returnmessageid = OD.id
    ),
    IdAgg AS (
        SELECT
            Q.returnmessageid,
            STRING_AGG(Q.id, ',') WITHIN GROUP(ORDER BY Q.id) AS IdAggregate
        FROM InitialQuery Q
        GROUP BY Q.returnmessageid
    )
    SELECT Q.*, I.IdAggregate
    FROM InitialQuery Q
    JOIN IdAgg I ON I.returnmessageid = Q.returnmessageid
    

    Here are the results from a simplified version with some generated test data:

    returnmessageid OtherData id IdAggregate
    1 Other Data 101 101,102,103
    1 Other Data 102 101,102,103
    1 Other Data 103 101,102,103
    2 Other Data 104 104,105,106,107,108
    2 Other Data 105 104,105,106,107,108
    2 Other Data 106 104,105,106,107,108
    2 Other Data 107 104,105,106,107,108
    2 Other Data 108 104,105,106,107,108
    3 Other Data 109 109,110
    3 Other Data 110 109,110

    See this db<>fiddle for a demo.