rlistigraphoverlapclique

R igraph find all maximal cliques without overlapping


I am trying to find all maximal cliques in a graph, without overlapping. the function max_cliques() returns all possible maximal cliques in the graph, but I want every vertex to be included in only one clique- in the largest clique it can be part of.

for example, if the output of the max_cliques() are the following cliques:

{A,B,C}, {A,B,D}, {A,B,J,K}, {E,F,G,H}, {E,F,G,I}

I want to remove some cliques so that all the vertexes will appear in exacly one clique, so the final set will be:

{A,B,J,K}, {E,F,G,H}

A and B are included in 3 cliques, so I want to choose the cliques so that the final set will include maximum vertexes as possible. if there are two possible cliques in the same length- take a random one. (I don't mind to not include all the vertexes)

I would really appreciate an idea to solve this problem, even without going into details of cliques- the question is basically how to remove the shortest "lists" that contain overlapping elements.

thanks in advance


Solution

  • This is apparently a pretty hard problem to solve as you asking about Coverage and Independent Set problems. These are NP-complete problems. What this means is that as your graph grows computational time is going to increase exponentially.

    I think this what you are aiming for. My approach is as follows:

    1. Find cliques.
    2. Convert to incidence matrix (cliques by nodes).
    3. Multiply the incidence matrix by its transpose (%*%) this creates an adjacency matrix
    4. create graph of cliques from adjacency matrix (cliques are connected to other cliques if they share a node)
    5. Find all independent sets of vertices (this is the bottleneck)
    6. Retrieve original nodes for independent sets of cliques
    7. Find set with most nodes.

    Code

    library(igraph)  
    set.seed(8675309)  
    g <- graph_from_edgelist(matrix(sample(LETTERS[1:10], 50, replace=T), ncol = 2), directed = FALSE)  
    plot(g, edge.arrow.size=0.5)
    
    cliques <- max_cliques(g)
    
    cliqueBP <- matrix(c(rep(paste0("cl", seq_along(cliques)), sapply(cliques, length)), names(unlist(cliques))), ncol=2, )
    bp <- graph_from_edgelist(cliqueBP, directed = F)
    V(bp)$type <- grepl("cl", V(bp)$name)
    # plot(bp, layout=layout_as_bipartite)
    
    bp.ind <- t(as_incidence_matrix(bp))
    bp.adj <- bp.ind %*% t(bp.ind)
    
    bp.adj.g <- graph_from_adjacency_matrix(bp.adj, mode = "undirected")
    # plot(simplify(bp.adj.g))
    bp.adj.mis <- independent.vertex.sets(bp.adj.g)
    
    sets <- lapply(bp.adj.mis, function(x) cliqueBP[cliqueBP[,1] %in% as_ids(x), 2])
    sets[which(sapply(sets, length) == max(sapply(sets, length)))]
    
    # [[1]]
    # [1] "G" "J" "E" "I" "B" "H" "F" "D"
    # 
    # [[2]]
    # [1] "G" "J" "E" "I" "F" "C" "B" "H"
    # 
    # [[3]]
    # [1] "G" "J" "E" "I" "F" "C" "A" "H"
    # 
    # [[4]]
    # [1] "G" "B" "E" "I" "F" "C" "A" "H"
    # 
    # [[5]]
    # [1] "G" "B" "E" "I" "F" "C" "H" "J"