c++algorithmtime-complexitydisjoint-sets

Why don't we update rank for disjoint set after path compression?


I have made a template for disjoint set with rank heuristic and path compression.

template <typename T>
class disJSet
{
    map<T,T> parent;
    map<T,int> rank;
    public:
    //Linear time complexity 
    void makeSet(vector<T> it)
    {
        for(T i:it)
        {
            parent[i]=i;
            rank[i]=0;
        }
    }

    //Time complexity of O(log*n)
    T find(T el)
    {
        if(el!=parent[el])
            parent[el]=find(parent[el]);
        
        return parent[el];
    }

    //Time complexity of O(log*n)
    bool unionOp(T a,T b)
    {
        T a_id=find(a);
        T b_id=find(b);

        if(a_id==b_id)
            return false;

        if(rank[a_id]<rank[b_id])
            parent[a_id]=b_id;
        else
        {
            parent[b_id]=a_id;
            if(rank[a_id]==rank[b_id])
            {
                rank[b_id]=rank[b_id]+1;
            }
        }
        return true;    
    }
};

Path compression is implemented in find() method. Why are we not updating rank after path compression?

My reason : Whenever there are calls to find, each intermediate node will become a leaf node due to path compression. And suppose if that is the longest subtree for the parent, it's height is now changed. But we do not update the rank for the parent/root node.

This may create difference in union operation then. For example, union of two elements will result in making one tree child of other by comparing their ranks. But ranks may not be representing the maximum heights of tree due to calls to find().


Solution

  • You can't update rank after path compression, because there may be other paths to that root which are longer than the new path length.

    And you don't need to update the rank after path compression, because it only needs to represent an upper bound on the path length.