rustgraph-algorithmhashsetcliqueclique-problem

Why do I obtain different results for the Bron-Kerbosch algorithm when switching between a BTreeSet and a HashSet?


I've been trying to implement the Bron-Kerbosch algorithm in Rust for my master's thesis. So far everything was working, but when I tried changing from a BTreeSet to a HashSet for performance comparison purposes, the behaviour became completely random (the results are at least).

I can't find any information about the order of the nodes having any influence on the result, yet, changing to an unordered collection breaks the results as the algorithm seems to miss some branches during the backtracking.

use std::collections::hash_map::Entry::{Occupied, Vacant};
use std::collections::{BTreeSet, HashMap};

type Nodes = BTreeSet<u32>;
type Graph = HashMap<u32, Nodes>;
type Record = (u32, u32);

fn init_nodes(records: &[Record]) -> Graph {
    let mut nodes: Graph = Graph::with_capacity(records.len());
    for record in records.iter() {
        let n: &mut Nodes = match nodes.entry(record.0) {
            Vacant(entry) => entry.insert(Nodes::new()),
            Occupied(entry) => entry.into_mut(),
        };
        n.insert(record.1);
        nodes.entry(record.1).or_insert_with(Nodes::new);
    }
    nodes.shrink_to_fit();
    nodes
}

fn bron1(graph: &Graph, r: Nodes, mut p: Nodes, mut x: Nodes, cliques: &mut Vec<Nodes>) {
    if p.is_empty() && x.is_empty() {
        cliques.push(r);
    } else if !p.is_empty() {
        let nodes = p.iter().cloned().collect::<Nodes>();
        nodes.iter().for_each(|node| {
            let neighbours: &Nodes = graph.get(node).unwrap();
            let mut to_add: Nodes = Nodes::new();
            to_add.insert(*node);
            bron1(
                graph,
                r.union(&to_add).cloned().collect(),
                p.intersection(&neighbours).cloned().collect(),
                x.intersection(&neighbours).cloned().collect(),
                cliques,
            );
            p.remove(node);
            x.insert(*node);
        });
    }
}

fn display_cliques(cliques: &[Nodes]) {
    let max = (&cliques[0]).len();
    let mut count = 0;
    for (idx, cl) in cliques.iter().enumerate() {
        if cl.len() != max {
            count = idx;
            break;
        }
    }
    println!(
        "Found {} cliques of {} nodes on a total of {} cliques",
        count,
        max,
        cliques.len()
    )
}

fn main() {
    let records: Vec<Record> = vec![
        (1, 88160),
        (1, 118_052),
        (1, 161_555),
        (1, 244_916),
        (1, 346_495),
        (1, 444_232),
        (1, 447_165),
        (1, 500_600),
        (2, 27133),
        (2, 62291),
        (2, 170_507),
        (2, 299_250),
        (2, 326_776),
        (2, 331_042),
        (2, 411_179),
        (2, 451_149),
        (2, 454_888),
        (4, 16050),
        (4, 286_286),
        (4, 310_803),
        (4, 320_519),
        (4, 408_108),
        (4, 448_284),
        (5, 173_362),
    ];

    let nodes = init_nodes(&records);
    let r: Nodes = nodes.keys().copied().collect();
    let mut cliques: Vec<Nodes> = Vec::new();
    bron1(&nodes, Nodes::new(), r, Nodes::new(), &mut cliques);
    cliques.sort_unstable_by(|a, b| a.len().cmp(&b.len()).reverse());
    display_cliques(&cliques);
}

Playground

Running the code using BTreeSet gives the right results.

Found 24 cliques of 2 nodes on a total of 48 cliques

Changing the Nodes type to HashSet produces results that are completely different.

Found 5 cliques of 2 nodes on a total of 29 cliques

Solution

  • The order does not matter, and the program should work no matter if it uses HashSets or BTreeSets.

    The init_nodes function is incorrect because the Bron-Kerbosch algorithm works on undirected graphs, yet, the init_nodes function does not register the edges both ways, which makes the graph directed and results in the fact that the order matters.

    Here is a correct implementation of the function:

    fn init_nodes(records: &[Record]) -> Graph {
        let mut nodes: Graph = Graph::with_capacity(records.len());
        for r in records.iter() {
            let n: &mut Nodes = match nodes.entry(r.0) {
                Vacant(entry) => entry.insert(Nodes::new()),
                Occupied(entry) => entry.into_mut(),
            };
            n.insert(r.1);
            let n: &mut Nodes = match nodes.entry(r.1) {
                Vacant(entry) => entry.insert(Nodes::new()),
                Occupied(entry) => entry.into_mut(),
            };
            n.insert(r.0);
        }
        nodes.shrink_to_fit();
        nodes
    }