algorithmunion-find

Union-Find performance explanation


This is with reference to the book Algorithm (Fouth Edition) by Robert Sedgewick & kevin Wayne

For Union-find implementation, the code for find and union is given as (page 222) as:

public int find(int p)
{
  return id[p]; 
}

public void union(int p, int q)
{ 
 // Put p and q into the same component.

 int pID = find(p);
 int qID = find(q);

// Nothing to do if p and q are already in the same component.

 if (pID == qID) return;

// Rename p’s component to q’s name.

 for (int i = 0; i < id.length; i++)
      if (id[i] == pID) id[i] = qID;
 count--;
}

And then follows a proposition:

Proposition F. The quick-find algorithm uses one array access for each call to find() and between N + 3 and 2N + 1 array accesses for each call to union() that combines two components.

I want to understand how did we actually arrive at N + 3 and 2N + 1. I have some idea about N + 3, but no idea about 2N + 1. Please help.


Solution

  • For the case that pID != qID we have:

    2 accesses for:

    int pID = find(p);
    int qID = find(q);
    

    and N accesses in the loop at the if-condition:

    if (id[i] == pID)
    

    so far N+2, but since pID != qID at least p has pID!=qID so we will access one more time inside if statement: id[i] = qID; so we will access array at least N+3 times.

    In worst case that all N element have id pID we will execute id[i] = qID; N-1 times (not just once as before, N-1 because q has qID so we will not access array for q ) so overall: 2 + N (if condition in the for-loop for all N elements access) + N-1 (for id[i] = qID; ) = 2N+1.

    Example:

    if the array looks like: qID qID qID pID qID (5 elements)

    then calling union(1,4) //1 index is the 1st element(with qID), 4 is the 4th(pID)

    you will have 2 accesses in the beginning, 5 for if condition and only one if condition is true- for the 4 the element which is the only one with pID so you have 2+5+1 =8= N+3 the minimum.

    Now for example with max accesses take the array: qID pID pID pID pID now you will have 2+5 + 4(because four conditions are true) = 11 = 2N+1 (N=8).