I want to parse the following sample XML file, without the pickler module.
<?xml version="1.0" encoding="utf-8" ?>
<Groups>
<Name>ABC</Name>
<GroupA>
<Name>Foo</Name>
<Sum>100</Sum>
</GroupA>
<GroupB>
<Name>Bar</Name>
<Sum>0</Sum>
</GroupB>
</Groups>
I ended up with this:
{-# language Arrows #-}
import Text.XML.HXT.Core
data Groups = Groups GroupA GroupB deriving Show
data GroupA = GroupA String String deriving Show
data GroupB = GroupB String String deriving Show
readGroup :: LA XmlTree Groups
readGroup = deep (isElem >>> hasName "Groups") >>> getChildren >>>
proc root -> do
a <- readGroupA -< root
b <- readGroupB -< root
returnA -< Groups a b
readGroupA :: LA XmlTree GroupA
readGroupA = isElem >>> hasName "GroupA" >>> getChildren >>>
proc root -> do
n <- isElem >>> hasName "Name" /> getText -< root
s <- isElem >>> hasName "Sum" /> getText -< root
returnA -< GroupA n s
readGroupB :: LA XmlTree GroupB
readGroupB = isElem >>> hasName "GroupB" >>> getChildren >>>
proc root -> do
n <- isElem >>> hasName "Name" /> getText -< root
s <- isElem >>> hasName "Sum" /> getText -< root
returnA -< GroupB n s
Unfortunately, this does not work.
If I try to extract just a single element in a proc
context it works.
But trying to extract multiple elements will always fail\ return the empty list. I might have a misunderstanding of the composition >>>
.
I run the example with runLa (xreadDoc >>> readGroups)
Try this:
readGroup :: LA XmlTree Groups
readGroup = deep (isElem >>> hasName "Groups") >>>
proc root -> do
a <- getChildren >>> readGroupA -< root
b <- getChildren >>> readGroupB -< root
returnA -< Groups a b
readGroupA :: LA XmlTree GroupA
readGroupA = isElem >>> hasName "GroupA" >>>
proc root -> do
n <- getChildren >>> isElem >>> hasName "Name" /> getText -< root
s <- getChildren >>> isElem >>> hasName "Sum" /> getText -< root
returnA -< GroupA n s
readGroupB :: LA XmlTree GroupB
readGroupB = isElem >>> hasName "GroupB" >>>
proc root -> do
n <- getChildren >>> isElem >>> hasName "Name" /> getText -< root
s <- getChildren >>> isElem >>> hasName "Sum" /> getText -< root
returnA -< GroupB n s
When the call to getChildren
is outside the do
-block, you are committing to one child before even entering the proc
. Inside the proc
, you check (for instance) whether that child has name Name
and name Sum
. Unsurprisingly, you don't find any child fitting these contradictory requirements.
By moving getChildren
inside, you allow different children to be traversed for (for instance) n
and s
.