disjoint-setsdisjoint-union

union find set algorithm return parent[v] = find_set(parent[v]); means


int find_set(int v) {
if (v == parent[v])
    return v;
return parent[v] = find_set(parent[v]); }

This is a slice of code of disjoin set algorithm can i ask what is the meaning and purpose of return parent[v] = find_set(parent[v]);? usually the return are integers,boolean,or other data types.


Solution

  • That line of code (parent[v] = find_set(parent[v]);) is an heuristic optimization called path compression, a simple and highly effective heuristic. Path compression makes that each node on the find_set path points directly to the root. Further calls to find_set on any node in the path before will be very fast. The time complexity of this heuristic is shown below.

    As stated in the book Introduction to Algorithms (Cormen, page 571-572), 3rd Edition:

    Alone, union by rank yields a running time of O(m log n), and this bound is tight. Although we shall not prove it here, for a sequence of n MAKE-SET operations (and hence at most n - 1 UNION operations) and f FIND-SET operations, the path-compression heuristic alone gives a worst-case running time of O(n + f * (1 + log2 + f/nn)).

    When we use both union by rank and path compression, the worst-case running time is O(m α(n)), where α(n) is a very slowly growing function, which we define in Section 21.4. In any conceivable application of a disjoint-set data structure, α(n) ≤ 4; thus, we can view the running time as linear in m in all practical situations. Strictly speaking, however, it is superlinear.

    You can learn more about it by reading that book.