java algorithm graph tarjans-algorithm

Tarjan's algorithm wrongly detecting cycles


Whenever I run Tarjan's algorithm on any graph, it always claims there is a cycle. For example, for this graph:

A -> B -> C

The algorithm will tell me there is a cycle:

[a]
[b]

When there is a cycle, for example:

A -> B -> C -> A

The output is quite strange:

[c, b, a]
[a]
[b]

Here's my implementation:

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.stream.Collectors;

public class Tarjans {

    private static class Node {
        public int index = -1, lowLink = -1;
        public String name;

        public Node(String name) {
            this.name = name;
        }

        public String toString() {
            return name;
        }
    }

    HashMap<String, Node> nodes = new HashMap<>();
    HashMap<String, ArrayList<Node>> graph = new HashMap<>();

    private int index = 0;
    private ArrayDeque<Node> visited = new ArrayDeque<>();
    private HashSet<String> stack = new HashSet<>();

    public ArrayList<ArrayList<Node>> tarjan() {
        ArrayList<ArrayList<Node>> cycles = new ArrayList<>();
        for (String key : graph.keySet()) {
            Node n = nodes.get(key);
            if (n == null) {
                System.err.println("what is " + n + "?");
                return new ArrayList<ArrayList<Node>>();
            }

            ArrayList<Node> cycle = strongConnect(n);
            if (cycle.size() > 0) {
                cycles.add(cycle);
            }
        }
        return cycles;
    }

    private ArrayList<Node> strongConnect(Node node) {
        node.index = index;
        node.lowLink = index;
        index += 1;

        visited.push(node);
        stack.add(node.name);

        ArrayList<Node> neighbours = graph.get(node.name);
        if (neighbours == null) return new ArrayList<>();

        neighbours.forEach(n -> {
            if (n.index == -1) {
                strongConnect(n);
                node.lowLink = Math.min(node.lowLink, n.lowLink);
            }
            else if (stack.contains(n.name)) {
                node.lowLink = Math.min(node.lowLink, n.index);
            }
        });

        ArrayList<Node> cycle = new ArrayList<>();
        if (node.lowLink == node.index) {
            Node p = null;
            do {
                p = visited.pop();
                stack.remove(p.name);
                cycle.add(p);
            } while (p != node);
        }
        return cycle;
    }

    private void foo() {
        nodes.put("a", new Node("a"));
        nodes.put("b", new Node("b"));
        nodes.put("c", new Node("c"));

        // A -> B -> C -> A
        graph.put("a", new ArrayList<>(Arrays.asList(nodes.get("b"))));
        graph.put("b", new ArrayList<>(Arrays.asList(nodes.get("c"))));
        graph.put("c", new ArrayList<>(Arrays.asList(nodes.get("a"))));

        ArrayList<ArrayList<Node>> cycles = tarjan();
        for (ArrayList<Node> cycle : cycles) {
            System.out.println("[" + cycle.stream().map(Node::toString).collect(Collectors.joining(",")) + "]");
        }
    }

    public static void main(String[] args) {
        new Tarjans().foo();
    }

}

But I'm not sure where I'm going wrong. I've followed the Wikipedia article on Tarjan's algorithm and its pseudocode nearly 1:1. I'm very new to graph theory and graph algorithms, so I can't work out what the mistake is.

Fix for tarjan():

public ArrayList<ArrayList<Node>> tarjan() {
    ArrayList<ArrayList<Node>> cycles = new ArrayList<>();
    for (Node n : nodes.values()) {
        if (n == null) {
            System.err.println("what is " + n + "?");
            return new ArrayList<ArrayList<Node>>();
        }

        if (n.index == -1) {
            ArrayList<Node> cycle = strongConnect(n);
            if (cycle.size() > 0) {
                cycles.add(cycle);
            }   
        }
    }
    return cycles;
}

Solution

  • Judging from the first revision of the code in the question, the problems boil down to "nearly" not being near enough: "I've followed the wikipedia article on [Tarjan's Strongly Connected Components] algorithm nearly 1:1 and the pseudocode."
    (And possibly to naming the variables that hold a strongly connected component "cycle": if edges (a, b), (a, c), (b, a), (b, c) and (c, a) belong to one graph, vertices a, b and c are in one strongly connected component, which is neither a single cycle nor a set of cycles that merely happen to share vertices pairwise.)

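    The first revision called strongConnect() for nodes that had already been visited; this was fixed in revision 7.
    As of revision 7, a node without neighbours/successors is still never checked for being the root of a strongly connected component: strongConnect() returns early on neighbours == null, before the lowLink == index test, so such a node stays on the stack.
    Handling a strongly connected component once it has been found is harder than it needs to be, and the result of the recursive strongConnect(n) call is simply discarded: keep a Set<Set<Node>> as a data member of the algorithm (instance) and just add each component to it.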

    Once you have your implementation working and the code cleaned up and commented, I suggest presenting it at Code Review: there are plenty of opportunities to make everyone's life as a (Java) coder easier, starting with yours.
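
The three points above could be sketched roughly as follows. This is a minimal sketch, not the code from the question: the class name TarjanSketch, the methods components() and hasCycle(), and the use of plain String vertices instead of the Node class are all assumptions made to keep the example self-contained. It visits each vertex only once, runs the root check even for vertices without successors, and collects every component in a Set<Set<String>> data member instead of returning it from the recursive call.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch class; names and layout are assumptions, not the asker's code.
class TarjanSketch {
    private final Map<String, List<String>> graph;           // vertex -> successors
    private final Map<String, Integer> index = new HashMap<>();
    private final Map<String, Integer> lowLink = new HashMap<>();
    private final Deque<String> stack = new ArrayDeque<>();
    private final Set<String> onStack = new HashSet<>();
    // Data member that receives every component as soon as it is found.
    private final Set<Set<String>> components = new LinkedHashSet<>();
    private int nextIndex = 0;

    private TarjanSketch(Map<String, List<String>> graph) {
        this.graph = graph;
    }

    static Set<Set<String>> components(Map<String, List<String>> graph) {
        TarjanSketch t = new TarjanSketch(graph);
        // Include vertices that only ever appear as successors.
        Set<String> vertices = new LinkedHashSet<>(graph.keySet());
        graph.values().forEach(vertices::addAll);
        for (String v : vertices) {
            if (!t.index.containsKey(v)) {   // never revisit a vertex
                t.strongConnect(v);
            }
        }
        return t.components;
    }

    private void strongConnect(String v) {
        index.put(v, nextIndex);
        lowLink.put(v, nextIndex);
        nextIndex++;
        stack.push(v);
        onStack.add(v);

        // A missing entry means "no successors", not "stop here".
        for (String w : graph.getOrDefault(v, List.of())) {
            if (!index.containsKey(w)) {
                strongConnect(w);
                lowLink.put(v, Math.min(lowLink.get(v), lowLink.get(w)));
            } else if (onStack.contains(w)) {
                lowLink.put(v, Math.min(lowLink.get(v), index.get(w)));
            }
        }

        // The root check runs for every vertex, also for ones without successors.
        if (lowLink.get(v).equals(index.get(v))) {
            Set<String> component = new HashSet<>();
            String w;
            do {
                w = stack.pop();
                onStack.remove(w);
                component.add(w);
            } while (!w.equals(v));
            components.add(component);
        }
    }

    // A component contains a cycle only if it has several vertices or a self-loop.
    static boolean hasCycle(Map<String, List<String>> graph) {
        for (Set<String> c : components(graph)) {
            if (c.size() > 1) return true;
            String only = c.iterator().next();
            if (graph.getOrDefault(only, List.of()).contains(only)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // A -> B -> C: three single-vertex components, no cycle.
        System.out.println(hasCycle(Map.of("a", List.of("b"), "b", List.of("c"))));
        // A -> B -> C -> A: one component {a, b, c}, which contains a cycle.
        System.out.println(hasCycle(Map.of("a", List.of("b"), "b", List.of("c"), "c", List.of("a"))));
    }
}
```

Note that every vertex ends up in exactly one strongly connected component, so output such as [a] and [b] for A -> B -> C does not by itself indicate a cycle. In this sketch only a component with more than one vertex, or a single vertex with an edge to itself, is reported by hasCycle().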