algorithmdata-structuresdirected-acyclic-graphsweighted-graph

Connecting Two Random Nodes in a Directed Acyclic Weighted Graph


Summary

So I have a directed acyclic weighted graph where each edge has an input and output node, each node has an ID and I use a dictionary to map all ingoing edges to one node by its ID and another to do the same for all outgoing edges so when presented a node ID I can tell all ingoing and outgoing edges in ~O(1) time.

Requirement

I need to be able to add new random edges (i.e. connect two random nodes) in a way where it is guaranteed that no matter how big the graph is it wont be having any cycles.

What I tried

Since I'm in full control of how to build my graph I could sort it topologically and use Kahn's Algorithm to figure if for two uniformly randomly chosen nodes N1 and N2 the graph will result in a cycle in O(n) time. The problem with that is what if it does? I'd have to try a new random pair and repeat the process until I get lucky. This sounds as if it will scale really badly with the graph since the more edge dense the graph is the more likely it is that a new random one will create a cycle.

I have also read this post: Generating a random DAG , which is similar in nature to my problem, however, I can't use the suggested solution to connect nodes based on their IDs and only connect smaller to larger IDs (nodes that came before with nodes that are new) due to other constraints that I have on the problem.

Question

Is there a way to design a structure that will only ever allow me to randomly pick between nodes none of which, if connected through a new edge, will yield a cycle irrelevant of the memory overhead? This should then be an O(1) operation.


Solution

  • I have an O(1) solution for the check if an edge can be included in the graph. However it will take you worst-case O(n) to insert the edge.

    You could maintain a binary reachability matrix R where R[u, v]=1 if you can reach v from u in your current graph and R[u, v]=0 if not. R can initially be computed once using Floyd-Warshall.

    If you want to check if you can include an edge (u,v) you now just have to check if R[v, u] = 0. If it is R[v, u] = 1 you are constructing a circle by inserting (u,v).

    The remaining problem becomes updating this structure. If you end up inserting the edge (u, v) into the graph you will set R[u, v] = 1. Additionally, all nodes that were able to reach u (R[:,u]=1) are are now able to reach all nodes reachable by v (R[v,:] = 1). Thus you will need to set your entries R[i, j] = 1 if R[i,u] = 1 and R[v:j] = 1.

    Unfortunately, the update step will take O(n) in the worst case. In practice when the graph is still rather sparse you should be able to implement this efficiently with a sparse matrix representation (hash list with indicies v for each row u where R[u, v] = 1) and be much faster.

    If you want to select a possible edge at random, you will have to additionally maintain and update a list of possible edges through the same structure.