haskelldepth-first-search# Can this generic DFS implementation in Haskell be used to detect cycles in an adjacency list?

Consider the following polymorphic DFS function from here:

```
-- | Depth-first search.
--
-- Generates the list of unique visited states from a
-- given starting state. States are unique up to the
-- characterizing function.
dfsOnN ::
Ord r =>
(a -> r) {- ^ state characterization -} ->
(a -> [a]) {- ^ successors function -} ->
[a] {- ^ initial states -} ->
[a] {- ^ visited states in depth-first order -}
dfsOnN rep next = loop S.empty
where
loop _ [] = []
loop !seen (x:xs)
| S.member r seen = loop seen xs
| otherwise = x : loop seen1 (next x ++ xs)
where
r = rep x
seen1 = S.insert r seen
```

Is it possible to detect cycles with this? I'm guessing no because of the way we handle the "already-seen" case. If not, is there a way we can generalize it so it retains all of its existing functionality, but can also detect cycles?

For concreteness, let us say we have:

```
neighbors :: Int -> [Int]
vertices :: [Int]
```

And we want to call it as `dfsOnN id neighbors vertices`

and somehow use the output to detect cycles.

Solution

A DFS that outputs a list discards a lot of information that is learned by a DFS. A "DFS tree" is a more informative output for a DFS. A DFS tree is a spanning tree of the visited nodes, such that edges of the original graph that do not belong to the tree can be classified as either *back edges* or *cross edges* depending on whether they form a directed cycle with edges of the tree (the linked Wikipedia article also defines a class *forward edges* but here we count them as cross edges for simplicity).

In the case of a DFS with multiple starting vertices, you get a forest of DFS trees.

```
data DfsTree a
= Tree a [DfsTree a]
| BackEdge a -- ancestor in the tree
| CrossEdge a -- already visited but not an ancestor in the tree
deriving (Show)
```

To construct a DFS tree (or forest), we must keep track of (1) previously `visited`

vertices, and (2) `ancestors`

of the current vertex (which is a subset of `visited`

). The main difference between `visited`

and `ancestors`

is that the inner recursive function `dfsFrom`

returns the updated `visited`

, whereas `ancestors`

is discarded so its changes don't persist across children of each vertex.

```
dfs :: Ord r => (a -> r) -> (a -> [a]) -> [a] -> [DfsTree a]
dfs rep next initial = forest
where
(_s, forest) = dfsForestFrom Set.empty Set.empty initial
-- Definition using mapAccumL:
-- dfsForestFrom ancestors visited = mapAccumL (dfsFrom ancestors) visited
-- or use the State monad.
dfsForestFrom ancestors visited [] = (visited, [])
dfsForestFrom ancestors visited (a : as) = (visited2, tree : trees)
where
(visited1, tree) = dfsFrom ancestors visited a
(visited2, trees) = dfsForestFrom ancestors visited1 as
dfsFrom ancestors visited a
| ra `Set.member` ancestors = (visited, BackEdge a)
| ra `Set.member` visited = (visited, CrossEdge a)
| otherwise =
let (visited', branches) = dfsForestFrom (Set.insert ra ancestors) (Set.insert ra visited) (next a) in
(visited', Tree a branches)
where ra = rep a
```

Many operations can then be defined by structural recursion on the DFS tree. `dfs`

guarantees that each reachable vertex appears exactly once in a `Tree`

constructor, that the list of `branches`

in `Tree x branches`

has as many branches as the adjacency list `next x`

, and that every edge corresponds to one constructor (`Tree`

, `BackEdge`

, or `CrossEdge`

). In other words, a DFS tree is a "cycle-free" representation of the graph.

For example, the code below implements cycle finding and connected component enumeration (only for undirected graphs (where the adjacency matrix is symmetric); connected components are trickier to compute for directed graphs, but it is still possible to do so without modifying `dfs`

).

A cycle is simply indicated by the presence of a back edge. To make the exercise a bit more interesting, `findCycle`

also returns the vertices in the cycle.

```
findCycle :: Ord r => (a -> r) -> (a -> [a]) -> [a] -> Maybe [a]
findCycle rep next initial = findMaybe (findCycleTree rep []) forest
where
forest = dfs rep next initial
findCycleTree :: Eq r => (a -> r) -> [a] -> DfsTree a -> Maybe [a]
findCycleTree rep ancestors (Tree a branches) = findMaybe (findCycleTree rep (a : ancestors)) branches
findCycleTree rep ancestors (BackEdge a) = Just (a : reverse (takeWhile (\a' -> rep a /= rep a') ancestors))
findCycleTree _ _ (CrossEdge _) = Nothing
findMaybe :: (a -> Maybe b) -> [a] -> Maybe b
findMaybe f = foldr ((<|>) . f) Nothing
```

Connected components in an undirected graph simply correspond to each tree of the DFS forest. `connected`

just flattens the trees into lists, ignoring back edges and cross edges.

```
-- Assumes undirected graph: x ∈ next y <-> y ∈ next x
connected :: Ord r => (a -> r) -> (a -> [a]) -> [a] -> [[a]]
connected rep next initial = [c | c@(_ : _) <- collectComponent <$> forest]
where
forest = dfs rep next initial
collectComponent (Tree a branches) = a : (branches >>= collectComponent)
collectComponent (BackEdge _) = []
collectComponent (CrossEdge _) = []
```

Full code with test cases:

```
import Control.Applicative ((<|>))
import Data.Traversable (mapAccumL)
import qualified Data.Map as Map
import Data.Map (Map)
import qualified Data.Set as Set
-- General DFS, outputs a DFS forest
-- Reference: https://en.wikipedia.org/wiki/Depth-first_search#Output_of_a_depth-first_search
data DfsTree a
= Tree a [DfsTree a]
| BackEdge a
| CrossEdge a
deriving (Show)
dfs :: Ord r => (a -> r) -> (a -> [a]) -> [a] -> [DfsTree a]
dfs rep next initial = forest
where
(_s, forest) = dfsForestFrom Set.empty Set.empty initial
-- Definition using mapAccumL:
-- dfsForestFrom ancestors visited = mapAccumL (dfsFrom ancestors) visited
-- or use the State monad.
dfsForestFrom ancestors visited [] = (visited, [])
dfsForestFrom ancestors visited (a : as) = (visited2, tree : trees)
where
(visited1, tree) = dfsFrom ancestors visited a
(visited2, trees) = dfsForestFrom ancestors visited1 as
dfsFrom ancestors visited a
| ra `Set.member` ancestors = (visited, BackEdge a)
| ra `Set.member` visited = (visited, CrossEdge a)
| otherwise =
let (visited', branches) = dfsForestFrom (Set.insert ra ancestors) (Set.insert ra visited) (next a) in
(visited', Tree a branches)
where ra = rep a
-- Finding cycles
findCycle :: Ord r => (a -> r) -> (a -> [a]) -> [a] -> Maybe [a]
findCycle rep next initial = findMaybe (findCycleTree rep []) forest
where
forest = dfs rep next initial
findCycleTree :: Eq r => (a -> r) -> [a] -> DfsTree a -> Maybe [a]
findCycleTree rep ancestors (Tree a branches) = findMaybe (findCycleTree rep (a : ancestors)) branches
findCycleTree rep ancestors (BackEdge a) = Just (a : reverse (takeWhile (\a' -> rep a /= rep a') ancestors))
findCycleTree _ _ (CrossEdge _) = Nothing
findMaybe :: (a -> Maybe b) -> [a] -> Maybe b
findMaybe f = foldr ((<|>) . f) Nothing
exampleCycle :: Maybe [Int]
exampleCycle = findCycle rep next initial
where
rep = id
next 0 = [1]
next 1 = []
next 2 = [3,5]
next 3 = [4]
next 4 = []
next 5 = [6,7]
next 6 = [2]
next _ = []
initial = [0,2]
-- Finding connected components
-- Assumes undirected graph: x ∈ next y <-> y ∈ next x
connected :: Ord r => (a -> r) -> (a -> [a]) -> [a] -> [[a]]
connected rep next initial = [c | c@(_ : _) <- collectComponent <$> forest]
where
forest = dfs rep next initial
collectComponent (Tree a branches) = a : (branches >>= collectComponent)
collectComponent (BackEdge _) = []
collectComponent (CrossEdge _) = []
exampleConnected :: [[Int]]
exampleConnected = connected rep next initial
where
rep = id
next 0 = [1]
next 1 = [0]
next 2 = [3,4]
next 3 = [2,4]
next 4 = [2,3]
next 5 = [6]
next 6 = [5]
next _ = []
initial = [0,2,5]
main :: IO ()
main = do
print exampleCycle -- cycle: [2,5,6]
print exampleConnected -- connected components: [[0,1],[2,3,4],[5,6]]
```

- How can I refactor from error to an ExceptT?
- How can I generate a random sequence of elements from a list in Haskell?
- Problem with quantified types and pattern matching
- Why is the `unicode-show` package necessary?
- Get sum of int or integer in Haskell
- Java 8: streams and the Sieve of Eratosthenes
- Can iterate be written with a fold?
- Haskell function with data pattern match and second argument gives Equations have different numbers of arguments
- How to use (->) instances of Monad and confusion about (->)
- Trouble understanding Haskell type unification with a nested `fmap`
- why ghc does not support PIE and Full RelRO in linux?
- Controlling Wai logger messages
- How do Haskell compilers decide whether to allocate on the heap or the stack?
- How can date/time format of Yesod logger be configured?
- In Haskell's Turtle library, how to copy a file, but preserve the file date
- Could not deduce (Semigroup (TimedEvents a))
- Instance of class with Data family yielding error "Couldn't match expected type"
- Is there a way to get a Haskell setup on Windows without an installation? (Copy + paste)
- Haskell parse error on input `<-'
- Inline assembly in Haskell
- How do I deal with arbitrary length tuple to build up a complex SQL query for Haskell's postgresql-simple's query function?
- Extract liftIO and runSql in separate function (Haskell)
- How to query NodeJS stream 'meta data'?
- Building multiple executables in the default Haskell Stack project
- Take elements of a list up to and including the first value that satisfies some predicate in Haskell
- How are Hackage package names mapped to 'cabal install' names?
- Insert entity into DB with manually created foreign key in Persistent library (Haskell)
- couldn´t match type 'IO´ with ´[]` inside a do
- Why doesn't contravariance apply in certain cases like [b] → Int < a → Int
- How does the Haskell compiler handle the 'where' statement?