algorithmunion-find

Union-find expressed as a social network


This is an interview question that I am trying to answer:

Given a social network containing N members and a log file containing M timestamps at which times pairs of members formed friendships, design an algorithm to determine the earliest time at which all members are connected (i.e, every member is a friend of a friend of a friend...of a friend). Assume that the log file is sorted by timestamp and that friendship is an equivalence relation. The running time of your algorithm should be M log N or better and use extra space proportional to N.

The first thing I thought was..."I can't do this!".

But then I thought how can this social network be expressed as a data structure. Union-find is a data structure that can be used. Now I have to understand what it means when all members are connected. How can I view the actual data structure and what it looks like when every member became friends with each other?

I think only until I can understand visually or conceptually how the system becomes fully connected can I begin to figure out how to find the timestamp that corresponds with that event.


Solution

  • When you add a friendship to the union-find datastructure, you can note if it results in two graph components being joined. Simply keep adding edges until N-1 of these merging events have happened.

    In pseudo-code form:

    G := UnionFind(1..N)
    count := 0
    for timestamp, p1, p2 in friendships {
        if G.Find(p1) != G.Find(p2) {
            G.Union(p1, p2)
            count++
            if count == N-1 {
                return timestamp
            }
        }
    }
    return +infinity