haskell arrow-abstraction

Arrows confusion when trying to use proc and do notation


I have been trying to write a more compact version of some Haskell code that uses arrows.

I am trying to convert the XML to a list of tuples.

Running tx2 produces: [("Item 1","Item One",["p1_1","p1_2","p1_3"]),("Item 2","Item Two",["p2_1","p2_2"])]

The code that I have works but I cannot help thinking that I should not have to use as many runLA calls as I do. I call runLA for each of getDesc, getDisp, and getPlist.

I thought I might be able to use proc and do notation to simplify it. Here is the working code:

{-# LANGUAGE Arrows, NoMonomorphismRestriction #-}
module Test1 where

import Text.XML.HXT.Arrow.ReadDocument
import Text.XML.HXT.Core

xml = "<top>\
        \<list>\
            \<item>\
                \<desc>Item 1</desc>\
                \<plist>\
                    \<p>p1_1</p>\
                    \<p>p1_2</p>\
                    \<p>p1_3</p>\
                \</plist>\
                \<display>Item One</display>\
            \</item>\
            \<item>\
                \<desc>Item 2</desc>\
                \<plist>\
                    \<p>p2_1</p>\
                    \<p>p2_2</p>\
                \</plist>\
                \<display>Item Two</display>\
            \</item>\
        \</list>\
    \</top>"

tx1 = runLA (xread >>> getChildren >>> hasName "list" >>> getChildren >>> hasName "item") xml
tx2 = map toTuple tx1

toTuple i =
    let desc    = getDesc i
        display = getDisp i
        plist   = getPlist i
    in (desc, display, plist)

aDesc = getChildren >>> hasName "desc" >>> getChildren >>> getText >>> unlistA
aDisp = getChildren >>> hasName "display" >>> getChildren >>> getText >>> unlistA
aPlist = getChildren >>> hasName "plist" >>> getChildren >>> deep getText

getDesc i = runLA aDesc i
getDisp i = runLA aDisp i
getPlist i = runLA aPlist i

But when I try to rewrite tx2 as follows:

aToTuple = proc tree -> do
            desc    <- aDesc  -< tree
            display <- aDisp  -< tree
            plist   <- aPlist -< tree
            returnA -< (desc, display, plist)

tx3 = map (\i -> runLA aToTuple i) tx1

It all falls in a big heap.

What am I missing with the conversion to proc/do notation?

Thanks.


Solution

  • You should almost never have to call a run function more than once on an HXT arrow to get the result you want. In your case, listA can be used instead of map runLA to collect all the results of an arrow into a list. You can also get rid of many of the getChildren calls by using the /> operator, since a /> b is short for a >>> getChildren >>> b.

    Your proc-version of toTuple looks fine to me, but I'd rewrite the rest of your example code as

    tx1 = runLA (xread /> hasName "list" /> hasName "item" >>> toTuple) xml
    
    toTuple = proc tree -> do
        desc <- aDesc -< tree
        disp <- aDisp -< tree
        plist <- aPlist -< tree
        returnA -< (desc, disp, plist)
    
    
    aDesc  = getChildren >>> hasName "desc" /> getText
    aDisp  = getChildren >>> hasName "display" /> getText
    aPlist = getChildren >>> hasName "plist" >>> listA (getChildren /> getText)
    

    And instead of using arrow notation, toTuple can be written simply as

    toTuple = aDesc &&& aDisp &&& aPlist >>> arr3 (,,)
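
For anyone who wants to try this, here is a minimal sketch that puts the pieces above into one module, with a single runLA call. It assumes the hxt package is installed. The names toTupleProc, toTupleComb, items, and Item are only illustrative and do not appear in the original code, and the type signatures are an addition that makes NoMonomorphismRestriction unnecessary. The XML is the same document as in the question, written without whitespace between elements.

```haskell
{-# LANGUAGE Arrows #-}
module Main where

import Text.XML.HXT.Core

-- Same document as in the question, without inter-element whitespace.
xml :: String
xml = concat
    [ "<top><list>"
    , "<item><desc>Item 1</desc>"
    , "<plist><p>p1_1</p><p>p1_2</p><p>p1_3</p></plist>"
    , "<display>Item One</display></item>"
    , "<item><desc>Item 2</desc>"
    , "<plist><p>p2_1</p><p>p2_2</p></plist>"
    , "<display>Item Two</display></item>"
    , "</list></top>"
    ]

-- Illustrative alias for the result tuple (not in the original code).
type Item = (String, String, [String])

-- Each of these yields one String per matching text node.
-- Note: no unlistA here, so the result is a whole String, not a Char.
aDesc, aDisp :: LA XmlTree String
aDesc = getChildren >>> hasName "desc"    /> getText
aDisp = getChildren >>> hasName "display" /> getText

-- listA collects every <p> text under <plist> into a single list result.
aPlist :: LA XmlTree [String]
aPlist = getChildren >>> hasName "plist" >>> listA (getChildren /> getText)

-- Variant 1: arrow (proc/do) notation.
toTupleProc :: LA XmlTree Item
toTupleProc = proc tree -> do
    desc  <- aDesc  -< tree
    disp  <- aDisp  -< tree
    plist <- aPlist -< tree
    returnA -< (desc, disp, plist)

-- Variant 2: plain combinators; arr3 unpacks the nested pair (a, (b, c)).
toTupleComb :: LA XmlTree Item
toTupleComb = aDesc &&& aDisp &&& aPlist >>> arr3 (,,)

-- One runLA call for the whole document, parameterised over the variant.
items :: LA XmlTree Item -> [Item]
items toTuple =
    runLA (xread /> hasName "list" /> hasName "item" >>> toTuple) xml

main :: IO ()
main = do
    print (items toTupleProc)
    print (items toTupleComb)
```

Both lines printed by main should match the output quoted in the question. One detail worth noticing is that the rewritten aDesc and aDisp drop the unlistA from the question: getText already returns a whole String, whereas getText >>> unlistA returns one Char per result. That only looked right in the original because runLA gathered the Chars back into a list. Inside a proc block it means desc and display are bound to single characters, and the arrow yields one tuple for every combination of them.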