sql-serversql-server-2012sql-server-2008-r2sql-server-2016

Remove duplicate while concatenating other columns values


My data table looks as below (Table1)

            Table1
            ------------------------------------------
            ID  | IPaddress     | Port  | Value
            ----|---------------|-------|-------------      
            1   | xx.yy.14.15   | 332   | This is good
            ------------------------------------------      
            2   | xx.yy.14.15   | 332   | I can work
            ------------------------------------------
            3   | xx.yy.12.12   | 400   | Looks ok
            ------------------------------------------
            4   | xx.yy.12.12   | 400   | can I work
            ------------------------------------------
            5   | xx.yy.12.12   | 400   | Yes, please
            -------------------------------------------
            6   | xx.yy.14.16   | 401   | How is this
            -------------------------------------------
            7   | xx.yy.14.16   | 401   | Looks ok
            -------------------------------------------
            8   | xx.yy.14.16   | 401   | can I work
            -------------------------------------------
            9   | xx.yy.14.16   | 401   | Yes, please
            -------------------------------------------

The desired result table:

            ID  | IPaddress     | Port  | Value
            ----|---------------|-------|-----------------------------------------------------------        
            1   | xx.yy.14.15   | 332   | This is good and I can work
            --------------------------------------------------------------------------------------      
            2   | xx.yy.12.12   | 400   | Looks ok and can I work and Yes, please
            ---------------------------------------------------------------------------------------
            3   | xx.yy.14.16   | 401   | How is this and Looks ok and can I work and Yes, please
            ---------------------------------------------------------------------------------------

Here is what have I tried:

            DECLARE @VAR1 VARCHAR(50)
            DECLARE @VAR2 VARCHAR(50)

            SELECT @VAR1 = T1.VALUE,@VAR2=T2.VALUE
            FROM TABLE1 AS T1 INNER JOIN TABLE1 AS T2 ON T1.ID =T2.ID
            WHERE T1.IPADDRESS =T2.IPADDRESS

            SELECT IPADDRSS,PORT,@VAR1 + ' AND ' +@VAR2 FROM
              SELECT T1.*,
              ROW_NUMBER() OVER (PARTITION BY T1.IPADDRESS,T1.PORT ORDER BY VALUE) AS NM
              FROM TABLE1 AS T1
              )TBL
            WHERE NM = 1

However, from the above query, I'm able to achieve the desired output if there are only 2 duplicate rows (NOTE: here I'm considering IPADDRESS and PORT as duplicate, while other columns are not duplicate).

However, how can I achieve my desired result when the there are same IPADDRESS and PORT are there in 3 or 4 or 5 rows ? Please note that, the number of rows having same IPADDRESS and PORT is dynamic and it may go more than 10 sometime.

So, how can I handle such dynamic situation while getting my desired result?


Solution

  • Just saw your comment re: SQL 2017. Shoulda said that up front, I've been working on an old-school aggregation technique that was griefing me. SQL 2017 gave us the long-overdue string_agg function, which makes it dead easy:

    SELECT
       row_number() over (order by IPaddress, Port) ID
      ,IPaddress
      ,Port
      ,string_agg(Value, ' and ')
     from Table1
     group by 
       IPaddress
      ,Port
    

    You may need to poke at it a bit if ordering is critical.

    @KeithL's version works too, with a bit of debugging...which I see you've just worked out. Me, I just don't like XML very much, is why I was working on an alternate.