sqlazure-sql-databaserelational-databasedatabricks-sql

Same Table value selection from same clients


I need 1 record per Id and to achieve it I tried with self join, case statement as well tried using COALESCE function but I am still getting multiple records for same client.

Below approach is mostly for "status" and "lifecycle" columns, For sys1, sys2 and sys3 columns we can consider same as is values 1 record per Id.

I need 1 line item for each Id and it should consider Not null values among 3 column sys1, sys2, sys3. Each records is unique if we see it but we always have presence as sys1 to be considered if sys1 value is null then consider sys2 value and if sys1 and sys2 value is null then consider sys3 value.

If sys1, sys2 and sys3 value is not null and are having different values then consider sys1 as preferred data

Input

Id sys1 sys2 sys3 name status lifecycle
1 11 Roy Active C
1 33 Roy
2 111 Mic
2 222 Mic Active C
2 333 Mic Active C
3 1111 Jim FC
4 11111 James
4 33333 James Incative FC

Expected Output

Id sys1 sys2 sys3 name status lifecycle
1 11 33 Roy Active C
2 111 222 333 Mic Active C
3 1111 Jim FC
4 11111 33333 James Incative FC

Solution

  • Same Table value selection from same clients

    Follow the below steps which I have tried with:

    Create a table with sample Data1 and insert the value:

    CREATE  TABLE SampleData1(
    
    Id INT,
    
    sys1 VARCHAR(10),
    
    sys2 VARCHAR(10),
    
    sys3 VARCHAR(10),
    
    name  VARCHAR(50),
    
    status  VARCHAR(20),
    
    lifecycle VARCHAR(10)
    
    );
    
    
    Insert Data into table:
    
    
    INSERT INTO SampleData (Id, sys1, sys2, sys3, name, status, lifecycle) VALUES
    (1, '11', NULL, NULL, 'Roy', 'Active', 'C'),
    (1, NULL, NULL, '33', 'Roy', NULL, NULL),
    (2, '111', NULL, NULL, 'Mic', NULL, NULL),
    (2, NULL, '222', NULL, 'Mic', 'Active', 'C'),
    (2, NULL, NULL, '333', 'Mic', 'Active', 'C'),
    (3, '1111', NULL, NULL, 'Jim', NULL, 'FC'),
    (4, '11111', NULL, NULL, 'James', NULL, NULL),
    (4, NULL, NULL, '33333', 'James', 'Incative', 'FC');
    

    Check the data inside the table:

    select* from SampleData1;
    

    enter image description here

    Run the query to get the expected output:

    SELECT
    
    Id,
    
    MAX(sys1) AS sys1,
    
    MAX(sys2) AS sys2,
    
    MAX(sys3) AS sys3,
    
    MAX(name) AS  name,
    
    MAX(CASE  WHEN  status  IS  NOT  NULL  THEN  status  END) AS  status,
    
    MAX(CASE  WHEN lifecycle IS  NOT  NULL  THEN lifecycle END) AS lifecycle
    
    FROM
    
    SampleData1
    
    GROUP  BY
    
    Id;
    

    enter image description here