I have a dataset as follows:
;WITH CTE AS
( SELECT *
FROM (VALUES
(1, 10, 20, 30)
(2, 10, 21, 31)
(3, 11, 21, 31)
(4, 12, 22, 32)
(5, 13, 23, 33)
(6, 14, 24, 33)
(7, 14, 25, 34)
(8, 15, 26, 36)
) AS MyValues(ID, GroupID1, GroupID2, GroupID3)
)
SELECT *
FROM CTE
How can I generate the following collapsing the individual groups into a single group?
ID | SingleGroupID |
---|---|
1 | 1 |
2 | 1 |
3 | 1 |
4 | 2 |
5 | 3 |
6 | 3 |
7 | 3 |
8 | 4 |
This is a typical graph-walking problem, where you want to build islands of id
s that have at least one group in common.
We can start by unpivoting the groups to build a list of nodes, and generate edges with a self-join (edges connect id
s that share the same group). We can then recursively traverse the edges, while keeping track of the path we followed until there. The last step is to aggregate by id.
So:
with
nodes as (
select t.id, v.grp
from mytable t
cross apply ( values (t.GroupID1), (t.GroupID2), (t.GroupID3) ) v(grp)
),
edges as (
select distinct n1.id as id1, n2.id as id2
from nodes n1
inner join nodes n2 on n1.grp = n2.grp
),
rec as (
select id1, id2, cast(id1 as nvarchar(max)) as visited from edges
union all
select r.id1, e.id2, concat(r.visited, ',', e.id2)
from rec r
inner join edges e on e.id1 = r.id2
where concat(',', r.visited, ',') not like concat('%,', e.id2, ',%')
),
fin as (
select id1, min(value) min_id
from rec r
cross apply string_split(r.visited, ',')
group by id1
)
select id1 as id, dense_rank() over(order by min_id) grp
from fin f
id | grp |
---|---|
1 | 1 |
2 | 1 |
3 | 1 |
4 | 2 |
5 | 3 |
6 | 3 |
7 | 3 |
8 | 4 |