Haskell's hxt fails if I add another line a function

I am trying to parse a COLLADA file with Haskell's hxt package.

I have been doing fine, but I have run into an odd bug (or more likely, an error on my part).

I have an arrow that looks like this:

processGeometry = proc x -> do
    geometry <- atTag "geometry" -< x
    meshID <- getAttrValue "id" -< geometry
    meshName <- getAttrValue "name" -< geometry
    mesh <- atTag "mesh" -< geometry
    sources <- hasName "source" <<< getChildren -< mesh
    positionSource <- hasAttrValue "id" ("-positions" `isSuffixOf`) -< sources
    positionArray  <- processFloatSource -< positionSource
    returnA -< positionArray

Adding the line

normalSource <- hasAttrValue "id" ("-normals" `isSuffixOf`) -< sources

near the bottom, however, makes the entire arrow fail.

This happens no matter what I return, even if I am returning the original x.

Here is my atTag function:

atTag tag = deep (isElem >>> hasName tag)

And here is my sample COLLADA file I am trying to parse: https://pastebin.com/mDSTH2TW

Why does adding a line change the outcome of the arrow completely, when it shouldn't do anything at all?

Solution

TL;DR: if you're looking for two separate child elements, use separate calls to getChildren.

Your variable sources doesn't represent the list of all source elements. Instead it's a single source. If you check the type of sources, you'll see it's XMLTree. So when you use hasAttrValue on it twice, you're looking for a single source element that matches both cases.

As for why it doesn't matter what you return: each line is executed even if its value isn't used. In fact unless you're using the output, you don't even have to assign it a name: a line of only hasAttrValue "id" (isSuffixOf "-normals") <- sources (removing normalSource <-) works just the same. So if you return x, it still returns x only when it can find that impossible source element.

You can get your code to find both separate source elements by doing separate two calls to getChildren -- one per separate element you're looking for -- and checking the "id" attribute of each one separately.

Here's a self-contained example if the above is unclear.

data Tree a = Tree a [Tree a]

exampleTree :: Tree String
exampleTree = Tree "root" [Tree "childA" [], Tree "childB" []]

newtype ListArrow a b = ListArrow { runListArrow :: a -> [b] }

instance Category ListArrow where
    id = ListArrow (\x -> [x])
    (ListArrow g) . (ListArrow f) = ListArrow (\x -> concatMap g (f x))

instance Arrow ListArrow where
    arr f = ListArrow (\x -> [f x])
    first (ListArrow f) = ListArrow (\(a, b) -> [ (a', b) | a' <- f a ])

getChildren :: ListArrow (Tree a) (Tree a)
getChildren = ListArrow gc where
    gc (Tree _ children) = children

hasContent :: Eq a => a -> ListArrow (Tree a) (Tree a)
hasContent content = ListArrow hc where
    hc cur@(Tree c _) = if content == c then [cur] else []

getContent :: ListArrow (Tree a) a
getContent = ListArrow gc where
    gc (Tree c _) = [c]

-- this has the same problem as the code in the question
findBothChildrenBad :: ListArrow (Tree String) (String, String)
findBothChildrenBad = proc root -> do
    -- child is a (single) child of the root
    child <- getChildren -< root

    -- childA == child, and filter to only cases where its content is "childA"
    childA <- hasContent "childA" -< child

    -- childB == child, and filter to only cases where its content is "childB"
    childB <- hasContent "childB" -< child
    -- now the content has to be both "childA" and "childB" -- so we're stuck

    childAContent <- getContent -< childA
    childBContent <- getContent -< childB
    returnA -< (childAContent, childBContent)

-- this is the fixed version
findBothChildren :: ListArrow (Tree String) (String, String)
findBothChildren = proc root -> do
    -- childA is a (single) child of the root
    childA <- getChildren -< root

    -- filter to only cases where its content is "childA"
    hasContent "childA" -< childA

    -- childB is a (potentially different) child of the root
    childB <- getChildren -< root

    -- filter to only cases where its content is "childB"
    hasContent "childB" -< childB
    -- we're not stuck here

    childAContent <- getContent -< childA
    childBContent <- getContent -< childB
    returnA -< (childAContent, childBContent)