algorithmgraphdirected-acyclic-graphslowest-common-ancestor

Set of Least Common Ancestors in a Directed Acyclic Graph


I have a hierarchy of nodes that I need to use for an analysis. Sort of like this

enter image description here

I'm trying to find an algorithm that will allow me to find the nearest common ancestors between two nodes. I know there are algorithms that find the lowest common ancestor, but I haven't been able to find one that allows us to find the nearest few.

For example, in the picture I linked above, if I give it two nodes: 0 and 1, it should return 2 and 5. i.e. It should return all common ancestors of the nodes that don't have descendants that are also common ancestors. A naive approach would be to get all the common ancestors of 0 and 1: {7, 5, 6, 3, 2}, and then eliminate 7 since it has descendants in the set. Then it'll eliminate 6 and 3 as well. So it would return SLCA = {5,2}. At the moment, I've stored all the ancestors of each node in a list. So this naive approach is possible. However, I'd like do a more efficient method that will be scalable for even very large graphs. Ultimately, for a given graph, I'd like to be able to compute the pairwise SLCA of each pair of nodes. I think this brute force approach would be slow for very large graphs.

Does anyone know of such an algorithm? I've been reading these papers, but they are pretty dense, and I've been stuck trying to understand them.

https://www.dcs.warwick.ac.uk/~czumaj/PUBLICATIONS/DRAFTS/LCA-Max-witness.pdf

http://www.ccs.neu.edu/home/dherman/browse/projects/mini-javac/papers/bender01finding.pdf

https://algo2.iti.kit.edu/download/fischer10new.pdf

https://algo2.iti.kit.edu/download/fischer10new.pdf

I appreciate the help


Solution

  • Well actually your "brute force" algorithm is quite efficient. Let's analyze it:

    Finding all of the common ancestors, you can do this with two BFS and an array that keep the node which are in both trees: Time - O(V + E), Memory: O(V).

    Now you can find all of the ancestors which do not have descendants in the group, which are all the nodes that are the closest to the roots. This will not take long, take the sub graph of this group and find a node with no incoming edges in O(V + E) time (By iterating over the nodes), and O(V) memory. This will be your answer.

    So in total - Time O(V + E), Memory O(V).

    The next answer is not the correct answer, it finds the closest common ancestor (I have written it by mistake and I don't really want to delete it):

    Duplicate the graph, now we have G1 and G2. For each node in G1 create a new edge over to the corresponding node in G2. 'Flip' all the edges in G2 - if there was an edge from v to u now it's from u to v.

    Call this new graph G, call the two nodes you need to find the SLCA u and v.

    It is easy to show that the shortest route from u in G1 to the corresponding v in G2 will pass through an edge from G1 to G2 and that the node of this edge (both nodes on this edge are corresponding) will be in the SLCA group.
    That is since if you look at the original graph, the paths we took from each node v, u is the shortest to meet, which is the definition of the SLCA group.

    Now you need to find all the path of the shortest path length and extract all of these nodes.

    Time: O(E + V) (shortest path - BFS)
    Memory: O(V) (BFS)