
Testing for a circuit when implementing Kruskal's algorithm


I'm trying to write a program that finds the minimum spanning tree. One problem I am having with this algorithm is testing for a circuit. What would be the best way to do this in Java?

OK, here is my code:

import java.io.*;
import java.util.*;

public class JungleRoads 
{
    public static int FindMinimumCost(ArrayList graph,int size)
    {
        int total = 0;
        int [] marked = new int[size];      //keeps track of which vertices are already in the MST

        //convert an arraylist to an array
        List<String> wrapper = graph;
        String[] arrayGraph = wrapper.toArray(new String[wrapper.size()]);
        String[] temp = new String[size];
        HashMap visited = new HashMap();


        for(int i = 0; i < size; i++)
        {
           // System.out.println(arrayGraph[i]);
            temp = arrayGraph[i].split(" ");

            //loop over connections of a current node
            for(int j =  2; j < Integer.parseInt(temp[1])*2+2; j++)
            {

                if(temp[j].matches("[0-9]+"))
                {
                    System.out.println(temp[j]);
                }
            }


        }


        graph.clear();
        return total;


    }


    public static void main(String[] args) throws IOException
    {

        FileReader fin = new FileReader("jungle.in");
        BufferedReader infile = new BufferedReader(fin);

        FileWriter fout = new FileWriter("jungle.out");
        BufferedWriter outfile = new BufferedWriter(fout);


        String line;
        line = infile.readLine();
        ArrayList graph = new ArrayList();

        do
        {

            int num = Integer.parseInt(line);
            if(num!= 0)
            {

                int size = Integer.parseInt(line)-1;

                for(int i=0; i < size; i++)
                {
                    line = infile.readLine(); 
                    graph.add(line);
                }

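               // write(int) would emit a single character, so convert the cost to text first
               outfile.write(Integer.toString(FindMinimumCost(graph, size)));
               outfile.newLine();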
            }   


            line = infile.readLine();
        }while(!line.equals("0"));

        // flush the buffered output and release both files
        outfile.close();
        infile.close();

    }
}

Solution

  • Kruskal's algorithm doesn't search for cycles directly, because that would be inefficient; instead it builds disjoint trees and then connects them. Since connecting two distinct trees with a single edge always produces a new tree, there is no need to check for cycles.

    If you look at the Wikipedia page, the algorithm is as follows:

    1. create a forest **F** (a set of trees), where each vertex in the graph is a separate tree
    2. create a set S containing all the edges in the graph
    3. while S is nonempty and F is not yet spanning
        a. remove an edge with minimum weight from S
        b. if that edge connects two different trees, then add it to the forest, combining 
           two trees into a single tree
        c. otherwise discard that edge.
    

    You should use a disjoint-set data structure for this. Again from Wikipedia:

    first sort the edges by weight using a comparison sort in O(E log E) time; this allows the step "remove an edge with minimum weight from S" to operate in constant time. Next, we use a disjoint-set data structure (Union&Find) to keep track of which vertices are in which components. We need to perform O(E) operations, two 'find' operations and possibly one union for each edge. Even a simple disjoint-set data structure such as disjoint-set forests with union by rank can perform O(E) operations in O(E log V) time. Thus the total time is O(E log E) = O(E log V).


    Creating Disjoint Forests

    Now you can take a look at the Incremental Components part of the Boost Graph Library. You need to implement three methods: MakeSet, Find and Union. After that you can implement Kruskal's algorithm. All you are doing is working with sets, and the simplest possible way to do so is with linked lists.

    Each set has one element designated as its representative, which is the first element in the set.

    1- First implement MakeSet with linked lists:

    This prepares the disjoint-sets data structure for the incremental connected components algorithm by making each vertex in the graph a member of its own component (or set).

    Just initialize each vertex (element) as the representative of a new set. We can do this by making each one its own parent:

     function MakeSet(x)
       x.parent := x
    

    2- Implement Find method:

    Find the representative element of the set that contains vertex x:

     function Find(x)
       if x.parent == x
          return x
       else
          return Find(x.parent)
    

    The if part checks whether the element is the representative. We mark the representative of each set, its first element, by making it its own parent.
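
    Steps 1 and 2 could be written in Java roughly as follows. This is only a sketch: the `Node` class, its `label` field and the `NodeSets` holder are names assumed for illustration, and Find is written as a loop rather than the recursion shown above so that long parent chains cannot overflow the call stack.

```java
// Hypothetical element type: each element stores a link to its parent.
class Node {
    final String label;
    Node parent;

    Node(String label) {
        this.label = label;
    }
}

class NodeSets {
    // MakeSet: the element becomes the representative of its own set.
    static void makeSet(Node x) {
        x.parent = x;
    }

    // Find: follow parent links until reaching an element that is its own parent.
    static Node find(Node x) {
        while (x.parent != x) {
            x = x.parent;
        }
        return x;
    }
}
```

    Right after `makeSet(a)`, the call `find(a)` returns `a` itself, which matches the if branch of the pseudocode.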

    3- Finally, once the previous steps are done, the simple part is implementing the Union method:

     function Union(x, y)
       xRoot := Find(x) // representative of the set containing x
       yRoot := Find(y) // representative of the set containing y
       xRoot.parent := yRoot // the second set's representative now
                             // represents the first set as well
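
    When the vertices are numbered 0 to n-1, as is common for graph input, the three operations can also be kept in a single array instead of linked elements. The sketch below assumes that numbering; the class name `DisjointSet` and the boolean return value of `union` are choices made for this example, not something the pseudocode prescribes.

```java
// Array-based disjoint sets over vertices numbered 0 .. n-1.
// parent[i] == i means vertex i is the representative of its set.
class DisjointSet {
    private final int[] parent;

    // MakeSet for every vertex: each one starts as its own representative.
    DisjointSet(int n) {
        parent = new int[n];
        for (int i = 0; i < n; i++) {
            parent[i] = i;
        }
    }

    // Find: walk up the parent links to the representative.
    int find(int x) {
        while (parent[x] != x) {
            x = parent[x];
        }
        return x;
    }

    // Union: attach the representative of x's set to the representative of y's set.
    // Returns false when x and y were already in the same set.
    boolean union(int x, int y) {
        int xRoot = find(x);
        int yRoot = find(y);
        if (xRoot == yRoot) {
            return false;
        }
        parent[xRoot] = yRoot;
        return true;
    }
}
```

    Returning a boolean lets the caller see in one call whether the two vertices were already connected, which is exactly the circuit test the question asks about.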
    

    Now, how should you run Kruskal's algorithm?

    First put all n nodes into n disjoint sets with MakeSet. In each iteration, after picking the next edge (the minimal one not yet processed), use Find to get the representative elements of the sets containing its two endpoints. If they are the same, discard the edge, because it would create a cycle. If they are in different sets, use Union to merge them. Since each set is a tree, their union through that edge is also a tree.
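
    Put together, the loop described above might look like this in Java. It is a minimal sketch under a few assumptions: vertices are numbered 0 to n-1, and the `Edge` class, `KruskalSketch` and `minimumCost` are hypothetical names that do not come from the question's code. The small graph in `main` is made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class KruskalSketch {
    // Hypothetical edge type: endpoints u and v (numbered 0 .. n-1) and a weight.
    static class Edge {
        final int u, v, weight;

        Edge(int u, int v, int weight) {
            this.u = u;
            this.v = v;
            this.weight = weight;
        }
    }

    // Find: follow parent links up to the representative of x's set.
    static int find(int[] parent, int x) {
        while (parent[x] != x) {
            x = parent[x];
        }
        return x;
    }

    // Returns the total weight of the edges Kruskal's algorithm accepts.
    static int minimumCost(int vertexCount, List<Edge> edges) {
        int[] parent = new int[vertexCount];
        for (int i = 0; i < vertexCount; i++) {
            parent[i] = i;                       // MakeSet for each vertex
        }

        List<Edge> sorted = new ArrayList<>(edges);
        sorted.sort(Comparator.comparingInt((Edge e) -> e.weight)); // cheapest first

        int total = 0;
        int accepted = 0;
        for (Edge e : sorted) {
            if (accepted == vertexCount - 1) {
                break;                           // the forest is already spanning
            }
            int uRoot = find(parent, e.u);
            int vRoot = find(parent, e.v);
            if (uRoot == vRoot) {
                continue;                        // same tree: this edge would close a circuit
            }
            parent[uRoot] = vRoot;               // Union of the two trees
            total += e.weight;
            accepted++;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Edge> edges = Arrays.asList(
            new Edge(0, 1, 1), new Edge(1, 2, 2), new Edge(0, 2, 3), new Edge(2, 3, 4));
        System.out.println(minimumCost(4, edges)); // prints 7
    }
}
```

    In the sample graph the edge of weight 3 is skipped because vertices 0 and 2 already share a representative, so the accepted weights are 1, 2 and 4.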

    You can optimize this by choosing a better data structure for the disjoint sets, but for now I think this is enough. If you are interested in more advanced data structures, you can implement union by rank; Wikipedia documents it well. It's easy, but I left it out to avoid confusion.
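
    For reference, union by rank could be sketched as below, again assuming vertices numbered 0 to n-1. The class name `RankedDisjointSet` is made up for this example. The `find` method also applies path compression, which is not discussed above but is commonly paired with union by rank.

```java
class RankedDisjointSet {
    private final int[] parent;
    private final int[] rank;   // upper bound on the height of the tree under each root

    RankedDisjointSet(int n) {
        parent = new int[n];
        rank = new int[n];      // every rank starts at 0
        for (int i = 0; i < n; i++) {
            parent[i] = i;
        }
    }

    // Find with path compression: locate the root, then point every
    // vertex on the path directly at it.
    int find(int x) {
        int root = x;
        while (parent[root] != root) {
            root = parent[root];
        }
        while (parent[x] != root) {
            int next = parent[x];
            parent[x] = root;
            x = next;
        }
        return root;
    }

    // Union by rank: hang the root with the smaller rank under the other one.
    // Returns false when x and y were already in the same set.
    boolean union(int x, int y) {
        int xRoot = find(x);
        int yRoot = find(y);
        if (xRoot == yRoot) {
            return false;
        }
        if (rank[xRoot] < rank[yRoot]) {
            parent[xRoot] = yRoot;
        } else if (rank[xRoot] > rank[yRoot]) {
            parent[yRoot] = xRoot;
        } else {
            parent[yRoot] = xRoot;   // equal ranks: pick one root and bump its rank
            rank[xRoot]++;
        }
        return true;
    }
}
```

    Attaching the lower-ranked root under the higher-ranked one keeps the trees shallow, so the chains that `find` has to walk stay short.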