Consider the following polymorphic DFS function from here:
-- | Depth-first search.
--
-- Generates the list of unique visited states from a
-- given starting state. States are unique up to the
-- characterizing function.
dfsOnN ::
Ord r =>
(a -> r) {- ^ state characterization -} ->
(a -> [a]) {- ^ successors function -} ->
[a] {- ^ initial states -} ->
[a] {- ^ visited states in depth-first order -}
dfsOnN rep next = loop S.empty
where
loop _ [] = []
loop !seen (x:xs)
| S.member r seen = loop seen xs
| otherwise = x : loop seen1 (next x ++ xs)
where
r = rep x
seen1 = S.insert r seen
Is it possible to detect cycles with this? I'm guessing no because of the way we handle the "already-seen" case. If not, is there a way we can generalize it so it retains all of its existing functionality, but can also detect cycles?
For concreteness, let us say we have:
neighbors :: Int -> [Int]
vertices :: [Int]
And we want to call it as dfsOnN id neighbors vertices
and somehow use the output to detect cycles.
A DFS that outputs a list discards a lot of information that is learned by a DFS. A "DFS tree" is a more informative output for a DFS. A DFS tree is a spanning tree of the visited nodes, such that edges of the original graph that do not belong to the tree can be classified as either back edges or cross edges depending on whether they form a directed cycle with edges of the tree (the linked Wikipedia article also defines a class forward edges but here we count them as cross edges for simplicity).
In the case of a DFS with multiple starting vertices, you get a forest of DFS trees.
data DfsTree a
= Tree a [DfsTree a]
| BackEdge a -- ancestor in the tree
| CrossEdge a -- already visited but not an ancestor in the tree
deriving (Show)
To construct a DFS tree (or forest), we must keep track of (1) previously visited
vertices, and (2) ancestors
of the current vertex (which is a subset of visited
). The main difference between visited
and ancestors
is that the inner recursive function dfsFrom
returns the updated visited
, whereas ancestors
is discarded so its changes don't persist across children of each vertex.
dfs :: Ord r => (a -> r) -> (a -> [a]) -> [a] -> [DfsTree a]
dfs rep next initial = forest
where
(_s, forest) = dfsForestFrom Set.empty Set.empty initial
-- Definition using mapAccumL:
-- dfsForestFrom ancestors visited = mapAccumL (dfsFrom ancestors) visited
-- or use the State monad.
dfsForestFrom ancestors visited [] = (visited, [])
dfsForestFrom ancestors visited (a : as) = (visited2, tree : trees)
where
(visited1, tree) = dfsFrom ancestors visited a
(visited2, trees) = dfsForestFrom ancestors visited1 as
dfsFrom ancestors visited a
| ra `Set.member` ancestors = (visited, BackEdge a)
| ra `Set.member` visited = (visited, CrossEdge a)
| otherwise =
let (visited', branches) = dfsForestFrom (Set.insert ra ancestors) (Set.insert ra visited) (next a) in
(visited', Tree a branches)
where ra = rep a
Many operations can then be defined by structural recursion on the DFS tree. dfs
guarantees that each reachable vertex appears exactly once in a Tree
constructor, that the list of branches
in Tree x branches
has as many branches as the adjacency list next x
, and that every edge corresponds to one constructor (Tree
, BackEdge
, or CrossEdge
). In other words, a DFS tree is a "cycle-free" representation of the graph.
For example, the code below implements cycle finding and connected component enumeration (only for undirected graphs (where the adjacency matrix is symmetric); connected components are trickier to compute for directed graphs, but it is still possible to do so without modifying dfs
).
A cycle is simply indicated by the presence of a back edge. To make the exercise a bit more interesting, findCycle
also returns the vertices in the cycle.
findCycle :: Ord r => (a -> r) -> (a -> [a]) -> [a] -> Maybe [a]
findCycle rep next initial = findMaybe (findCycleTree rep []) forest
where
forest = dfs rep next initial
findCycleTree :: Eq r => (a -> r) -> [a] -> DfsTree a -> Maybe [a]
findCycleTree rep ancestors (Tree a branches) = findMaybe (findCycleTree rep (a : ancestors)) branches
findCycleTree rep ancestors (BackEdge a) = Just (a : reverse (takeWhile (\a' -> rep a /= rep a') ancestors))
findCycleTree _ _ (CrossEdge _) = Nothing
findMaybe :: (a -> Maybe b) -> [a] -> Maybe b
findMaybe f = foldr ((<|>) . f) Nothing
Connected components in an undirected graph simply correspond to each tree of the DFS forest. connected
just flattens the trees into lists, ignoring back edges and cross edges.
-- Assumes undirected graph: x ∈ next y <-> y ∈ next x
connected :: Ord r => (a -> r) -> (a -> [a]) -> [a] -> [[a]]
connected rep next initial = [c | c@(_ : _) <- collectComponent <$> forest]
where
forest = dfs rep next initial
collectComponent (Tree a branches) = a : (branches >>= collectComponent)
collectComponent (BackEdge _) = []
collectComponent (CrossEdge _) = []
Full code with test cases:
import Control.Applicative ((<|>))
import Data.Traversable (mapAccumL)
import qualified Data.Map as Map
import Data.Map (Map)
import qualified Data.Set as Set
-- General DFS, outputs a DFS forest
-- Reference: https://en.wikipedia.org/wiki/Depth-first_search#Output_of_a_depth-first_search
data DfsTree a
= Tree a [DfsTree a]
| BackEdge a
| CrossEdge a
deriving (Show)
dfs :: Ord r => (a -> r) -> (a -> [a]) -> [a] -> [DfsTree a]
dfs rep next initial = forest
where
(_s, forest) = dfsForestFrom Set.empty Set.empty initial
-- Definition using mapAccumL:
-- dfsForestFrom ancestors visited = mapAccumL (dfsFrom ancestors) visited
-- or use the State monad.
dfsForestFrom ancestors visited [] = (visited, [])
dfsForestFrom ancestors visited (a : as) = (visited2, tree : trees)
where
(visited1, tree) = dfsFrom ancestors visited a
(visited2, trees) = dfsForestFrom ancestors visited1 as
dfsFrom ancestors visited a
| ra `Set.member` ancestors = (visited, BackEdge a)
| ra `Set.member` visited = (visited, CrossEdge a)
| otherwise =
let (visited', branches) = dfsForestFrom (Set.insert ra ancestors) (Set.insert ra visited) (next a) in
(visited', Tree a branches)
where ra = rep a
-- Finding cycles
findCycle :: Ord r => (a -> r) -> (a -> [a]) -> [a] -> Maybe [a]
findCycle rep next initial = findMaybe (findCycleTree rep []) forest
where
forest = dfs rep next initial
findCycleTree :: Eq r => (a -> r) -> [a] -> DfsTree a -> Maybe [a]
findCycleTree rep ancestors (Tree a branches) = findMaybe (findCycleTree rep (a : ancestors)) branches
findCycleTree rep ancestors (BackEdge a) = Just (a : reverse (takeWhile (\a' -> rep a /= rep a') ancestors))
findCycleTree _ _ (CrossEdge _) = Nothing
findMaybe :: (a -> Maybe b) -> [a] -> Maybe b
findMaybe f = foldr ((<|>) . f) Nothing
exampleCycle :: Maybe [Int]
exampleCycle = findCycle rep next initial
where
rep = id
next 0 = [1]
next 1 = []
next 2 = [3,5]
next 3 = [4]
next 4 = []
next 5 = [6,7]
next 6 = [2]
next _ = []
initial = [0,2]
-- Finding connected components
-- Assumes undirected graph: x ∈ next y <-> y ∈ next x
connected :: Ord r => (a -> r) -> (a -> [a]) -> [a] -> [[a]]
connected rep next initial = [c | c@(_ : _) <- collectComponent <$> forest]
where
forest = dfs rep next initial
collectComponent (Tree a branches) = a : (branches >>= collectComponent)
collectComponent (BackEdge _) = []
collectComponent (CrossEdge _) = []
exampleConnected :: [[Int]]
exampleConnected = connected rep next initial
where
rep = id
next 0 = [1]
next 1 = [0]
next 2 = [3,4]
next 3 = [2,4]
next 4 = [2,3]
next 5 = [6]
next 6 = [5]
next _ = []
initial = [0,2,5]
main :: IO ()
main = do
print exampleCycle -- cycle: [2,5,6]
print exampleConnected -- connected components: [[0,1],[2,3,4],[5,6]]