haskellrecursiontree

Handling starred and non-starred nodes in a tree transformation


Problem:

Starred Nodes:

Requirement:

Starting from this tree,

Node
  "Root"
  [ Node
      "Alpha"
      [ Node "Beta*" -- A starred node
          [ Node "Gamma" [], -- Unstarred child of a starred node
            Node "Delta*" -- Starred child of another starred node
              [ Node "Epsilon" [] ] -- Unstarred child of "Delta*"
          ],
        Node "Zeta" [] -- Unstarred node
      ],
    Node
      "Theta"
      [ Node "Iota" [],
        Node "Kappa*" [] -- Starred node that does not descend from other starred nodes
      ]
  ]

I would like to produce:

Node
  "Root"
  [ Node
      "Beta*" -- A starred node
      [ Node "Delta*" -- Starred child of another starred node
          [] 
      ],
    Node "Kappa*" [] -- Starred node that does not descend from other starred nodes
  ]

Solution

  • Here's one straightforward answer. First we'll write a helper function; it will return the largest starred subtree starting at the current node -- if there is one! -- together with the collection of other starred subtrees from deeper in the tree.

    helper :: Tree String -> (Forest String, Forest String)
    helper (Node label children)
      | hasStar label = ([Node label starredChildren], otherStarredSubtrees)
      | otherwise = ([], starredChildren ++ otherStarredSubtrees)
      where
      rec = map helper children
      starredChildren = rec >>= fst
      otherStarredSubtrees = rec >>= snd
    

    You could use this function directly. For example:

    > helper example
    ([],[Node {rootLabel = "Beta*", subForest = [Node {rootLabel = "Delta*", subForest = []}]},Node {rootLabel = "Kappa*", subForest = []}])
    

    But it's probably convenient to consolidate the two halves of the output at the top level, since the root may have been starred and you want to include that in the answer. You can either do this directly, combining the two outputs, or delegate to the helper, making up a fresh unstarred node above the root to avoid that special case.

    handle :: Tree String -> Forest String
    handle = uncurry (++) . helper -- direct
    handle = snd . helper . Node "" . (:[]) -- delegate
    

    Now we don't get the spurious empty list in the output.

    > handle example
    [Node {rootLabel = "Beta*", subForest = [Node {rootLabel = "Delta*", subForest = []}]},Node {rootLabel = "Kappa*", subForest = []}]