I am working on an undirected graph which supports:
addPerson
: Inserting a node person
.addRelation
: Adding an edge between id1
and id2
.modifyRelation
: Modify/Remove an edge between id1
and id2
.isCircle
: Check if id1
and id2
are connected.queryBlockSum
: Querying the number of disjoint sets it can be divided into.public class MyNetwork implements Network {
/* ... */
private final DisjointSet disjointSet;
@Override
public void addPerson(Person person) {
/* ... */
disjointSet.add(person.getId());
}
@Override
public void addRelation(int id1, int id2, int value) {
/* ... */
disjointSet.merge(id1, id2);
}
@Override
public void modifyRelation(int id1, int id2, int value) {
/* ... */
int parent = disjointSet.find(id1);
if (/* I should remove the edge */) {
HashSet<Integer> set1 = /* all nodes connected with id1 */;
HashSet<Integer> set2 = /* all nodes connected with id2 */;
if (!set1.contains(parent)) {
disjointSet.resetParent(set1, parent, id1);
}
if (!set2.contains(parent)) {
disjointSet.resetParent(set2, parent, id2);
}
}
}
@Override
public boolean isCircle(int id1, int id2) {
return disjointSet.find(id1) == disjointSet.find(id2);
}
@Override
public int queryBlockSum() {
return disjointSet.getSetCount();
}
}
In order to boost up efficiency, I implemented the disjoint set data structure. It supports:
add
: Adding an element x
.find
: Finding the parent of an element x
(with path compression).merge
: Merge two sets x
and y
.split
: Change the parents of all elements in c
from oldParent
to newParent
, splitting one set into two. (It is assured that for any element x
in c
, find(x) == oldParent
)getSetCount
: Get the number of sets.public class DisjointSet {
private final HashMap<Integer, Integer> parent;
private int setCount;
public DisjointSet() {
this.parent = new HashMap<>();
this.setCount = 0;
}
public int getSetCount() {
return setCount;
}
public void add(int x) {
parent.put(x, x);
setCount++;
}
public int find(int x) {
int parentX = parent.get(x);
if (parentX == x) {
return x;
}
parentX = find(parentX);
parent.put(x, parentX);
return parentX;
}
public boolean merge(int x, int y) {
int parentX = find(x);
int parentY = find(y);
if (parentX != parentY) {
parent.put(parentX, parentY);
setCount--;
return true;
}
return false;
}
public void split(Collection<Integer> c, int oldParent, int newParent) {
for (Integer x : c) {
parent.put(x, newParent);
}
setCount++;
}
}
The problem is, after merging and splitting multiple times, the disjoint set becomes buggy: I find that two elements are not in the same set (i.e. have different parents), while they are supposed to be. I try to updated the parents of all elements in c
before splitting, but it doesn't work:
public void split(Collection<Integer> c, int oldParent, int newParent) {
// This doesn't fix the bug.
for (Integer x : c) {
find(x);
}
for (Integer x : c) {
parent.put(x, newParent);
}
setCount++;
}
However, if I update the parents of all elements before the split operation, the bug got fixed:
public void split(Collection<Integer> c, int oldParent, int newParent) {
// This fixes the bug.
for (Integer x : parent.keySet()) {
find(x);
}
for (Integer x : c) {
parent.put(x, newParent);
}
setCount++;
}
I don't understand why I need to update the parents of all elements, even if they don't need to be split.
What is the correct way of performing the split operation on a disjoint set? Or do I have to rebuild the disjoint set because it doesn't support splitting at all?
Disjoint sets are not suited for solving this sort of problem online. A link cut tree could be used instead.
In its simplest form, link cut trees support Link
(add edge), Cut
(delete edge), and FindRoot
operations, each in amortized O(log N)
(where N
is the number of nodes). The number of connected components can be easily maintained separately.