jsonsql-servert-sqlsql-server-2017

Remove duplicate values in JSON Array


I have this value in a field of a SQL Server Table:

["I3","I23","B1","B3","B2","B4","B6","I1","I11","I4","I14","I24","I34","I5","I15","I25","I35","I6","I16","I26","I36","I21","B5","I31","I11","I3","I1","I31","I21","I21","I5","I4","I3","I21","I4","I23","B1","I23","I3","B1","B2","B3","I15","I15","B2","I13","I2"]

actually it is a JSON Array

ome values are present more than 1 time: I need to remove the duplicate (keeping first occurrence) therefore the result should be something like this:

["I3","I23","B1","B3","B2","B4","B6","I1","I11","I4","I14","I24","I34","I5","I15","I25","I35","I6","I16","I26","I36","I21","B5","I31","I13","I2"]

I've tried several solutions and I seem to be close to solution but in the best solution I got the result as an array of objects instead of array of values:

[
     {
        "c": "B1"
     },
     {
        "c": "B2"
     },
     {
        "c": "B3"
     },
     {
        "c": "B4"
     },
     {
        "c": "B5"
     },
     {
        "c": "B6"
     },
     {
        "c": "I1"
     },
     {
        "c": "I11"
     },
     {
        "c": "I13"
     },
     {
        "c": "I14"
     },
     {
        "c": "I15"
     },
     {
        "c": "I16"
     },
     {
        "c": "I2"
     },
     {
        "c": "I21"
     },
     {
        "c": "I23"
     },
     {
        "c": "I24"
     },
     {
        "c": "I25"
     },
     {
        "c": "I26"
     },
     {
        "c": "I3"
     },
     {
        "c": "I31"
     },
     {
        "c": "I34"
     },
     {
        "c": "I35"
     },
     {
        "c": "I36"
     },
     {
        "c": "I4"
     },
     {
        "c": "I5"
     },
     {
        "c": "I6"
     }
  ]

How can I fix it?


Solution

  • You need to disaggregate your values using OPENJSON, remove duplicates with GROUP BY (taking the min key to preserve the order) then aggregate them back up using STRING_AGG (there is no in built JSON function to create an array of values AFAIK), so something like this:

    DECLARE @T TABLE (JsonArray NVARCHAR(MAX));
    INSERT @T (JsonArray) 
    VALUES ('["I3","I23","B1","B3","B2","B4","B6","I1","I11","I4","I14","I24","I34","I5","I15","I25","I35","I6","I16","I26","I36","I21","B5","I31","I13","I2"]');
    
    SELECT  t.JsonArray,
            DistinctArray = JSON_QUERY(CONCAT('[', STRING_AGG(QUOTENAME(oj.Value,'"'), ',') WITHIN GROUP (ORDER BY oj.[Key]), ']'))
    FROM    @T AS t
            CROSS APPLY 
            (   SELECT  oj.Value, [Key] = MIN(CONVERT(INT, oj.[Key]))
                FROM    OPENJSON(t.JsonArray) AS oj
                GROUP BY oj.Value
            ) AS oj
    GROUP BY t.JsonArray;
    

    Example on db<>fiddle