I have following records:
Date Id Ethnicity Gender Age
-----------------------------------------------------
2021-03-25 001 NULL NULL 39
2021-03-25 001 Hispanic Male 39
2022-02-01 002 NULL NULL NULL
2022-02-01 002 Unknown Male 27
2022-02-01 002 Hispanic Male 27
2022-02-01 002 NULL Male NULL
2022-02-02 003 NULL Male Null
2022-02-02 003 White Male Null
I want to write a SQL query in SQL Server to concatenate the Ethnicity
, Gender
and Age
for each Date
and ID
group
2021-03-25 001 Hispanic, Male, 39
2022-02-01 002 Hispanic, Male, 27
2022-02-02 003 White, Male
Thanks in advance!
I tried to used STRING_AGG()
to concatenate these columns, but the data is duplicated such as for ID 002, I got "Unknown, Male, 27, Hispanic, Male,27, Male"
Something like this perhaps?
SELECT CONCAT_WS(', ', MAX(NULLIF(Ethnicity, 'Unknown')), MAX(NULLIF(gender, 'Unknown')), MAX(NULLIF(age, 'Unknown')))
, Date, ID
FROM (
VALUES (N'2021-03-25', N'001', NULL, NULL, 39)
, (N'2021-03-25', N'001', N'Hispanic', N'Male', 39)
, (N'2022-02-01', N'002', NULL, NULL, NULL)
, (N'2022-02-01', N'002', N'Unknown', N'Male', 27)
, (N'2022-02-01', N'002', N'Hispanic', N'Male', 27)
, (N'2022-02-01', N'002', NULL, N'Male', NULL)
, (N'2022-02-02', N'003', NULL, N'Male', NULL)
, (N'2022-02-02', N'003', N'White', N'Male', NULL)
) t (Date,Id,Ethnicity,Gender,Age)
GROUP BY date,id
This groups the dates and takes the maximum of each value while filtering out unknowns /nulls. Of course, it there are several non-null values, this might return incorrect numbers, but then you need to update the question with more details.
If you don't have CONCAT_WS, you can use CONCAT with similar effect.