I have a graph problem where we have:
More formally: Let the actual vertice network be a tree T. Given just the shortest path distances of T, you have to reconstruct the original network T.
Input: An n × n distance matrix H with Hi,j = δT (i,j), where T is the actual network of vertices and δT is the shortest path distance between vertice i and j in T.
Output: The n −1 edges of T.
Example:
•T is the actual vertice network.
•H is the n × n shortest path distance matrix.
•G(H) is the complete graph on n nodes, where edge (i,j) has weight Hi,j – i.e. the shortest path distance in T.
My question about Time Complexity:
What is the running time of the algorithm resulting from running the Prim algorithm on the input and returning the list of edges as a function of n? (Note that |E(G(H))|= Θ(n2)). Should Amortized analysis be used here? Im not really sure.
The time complexity of Prim's algorithm using the adjacency matrix of a complete graph is Theta(n^2)
. We can see this from the pseudocode of Prim's algorithm with our adjacency matrix H
:
Q
of vertices not in the tree, initially all vertices. Choose the first vertex to be our root R
.n
, key
and parent
. key
will store, at position i
, the minimum weight edge connecting the i
th vertex to the current MST; initially, this is +infinity. parent
will store, after the algorithm is done, the parent of each vertex in the MST rooted at R
. Initially, parent[i] = R
for all i
, except the root, which has no parent.H
(corresponding to the root R
) and assign key[i] = H[0][i]
. Remove R
from our set Q
.Q
is not empty:
Q
, and extract any vertex u
with minimum key[u]
v
from 0 to n-1:
v
in Q
and H[u][v] < key[v]
:
key[v]
= H[u][v]
parent[v]
= u
Here, the loop in (4) runs n-1
times, and inside the loop, we do Theta(n)
work. In total, that gives a runtime of Theta(n^2)
, which is optimal for any algorithm that needs to read the entire adjacency matrix. In particular, for a generic complete graph, this is optimal, but this doesn't imply that Prim's algorithm is optimal for your specific case with a narrower class of graphs formed from distance matrices.
To show that your problem transformation is correct, we need to verify that, given a tree T
with positive weights, the complete graph G(H)
formed by taking the distance matrix of T
as an adjacency matrix will satisfy:
G(H)
has a unique minimum spanning treeT
is a minimum spanning tree of G(H)
.This requires proving several properties of minimum spanning trees in general. One theorem about minimum spanning trees, proven as Corollary 3.5 in these MIT lecture notes, says that:
Let
G = (V,E,w)
be a connected, weighted, undirected graph. LetT
be any MST and let(U, V \ U)
be any cut. ThenT
contains a light edge for the cut.
Here, a 'light edge for a cut (U, V \ U)' means an edge whose weight is the minimum weight of all edges with exactly one endpoint in U
.
Now, we just need to choose appropriate cuts to prove what we want. For an arbitrary edge e
in your original tree T
, consider the two trees T1
and T2
we get by deleting e
.
Take the vertices of T1
, which we'll call V(T1)
, as our cut. We need to show that in the complete graph G(H)
, the edge e
is the unique light edge for that cut. In our original tree T
, e
is the only edge that crosses the cut. This means that any path with one endpoint u
in V(T1)
and the other endpoint v
in V(T2)
must include e
.
Since all the weights are positive, this means that the distance in T, distance(u,v) > weight(e)
, for any u, v such that (u in V(T1), v in V(T2), and (u, v) != e)
. Since the distance in T
between u
and v
is the weight of the edge (u, v)
in our complete graph G(H)
, this means that e
is the unique minimum weight edge that crosses the cut. Since e
was an arbitrary edge in T
, this now means that all edges of T
must be in our MST for G(H)
, so the unique MST of G(H)
is T
.