algorithmgraphgraph-theoryminimum-spanning-tree

How to build a Minimum Spanning Tree given a list of 200 000 nodes?


Problem

I have a list of approximatly 200000 nodes that represent lat/lon position in a city and I have to compute the Minimum Spanning Tree. I know that I need to use Prim algorithm but first of all I need a connected graph. (We can assume that those nodes are in a Euclidian plan)

To build this connected graph I thought firstly to compute the complete graph but (205000*(205000-1)/2 is around 19 billions edges and I can't handle that.

Options

Then I came across to Delaunay triangulation: with the fact that if I build this "Delauney graph", it contains a sub graph that is the Minimum Spanning Tree according and I have a total of around 600000 edges according to Wikipedia [..]it has at most 3n-6 edges. So it may be a good starting point for a Minimum Spanning Tree algorithm.

Another options is to build an approximately connected graph but with that I will maybe miss important edges that will influence my Minimum Spanning Tree.

My question

Is Delaunay a reliable solution in this case? If so, is there any other reliable solution than delaunay triangulation to this problem ?

Further information: this problem has to be solved in C.

Update

This was done sucessfully by doing Delaunay triangulation, you can see result here : https://github.com/aaalloc/ACM-Paris

Note : the code is quite ugly sometimes so be warned.


Solution

  • The Delaunay triangulation of a point set is always a superset of the EMST of these points. So it is absolutely "reliable"*. And recommended, as it has a size linear in the number of points and can be efficiently built.

    *When there are cocircular point quadruples, neither the triangulation nor the EMST are uniquely defined, but this is usually harmless.