As we have learned from my previous question, += is the operator that adds one element at a time. Is it possible to «detect» previously added elements and control how future additions are done?
Here is a simple program to start our investigation:
module Main (main) where

import Control.Monad (void)
import Text.XML.HXT.Core

main :: IO ()
main = void $ runX $ root [] [foo]
       >>> writeDocument [withIndent yes] "test.xml"

foo :: ArrowXml a => a XmlTree XmlTree
foo = eelem "foo" += bar += bar += bar -- += is left associative

bar :: ArrowXml a => a XmlTree XmlTree
bar = ifA (deep (hasName "bar")) (eelem "baz") (eelem "bar")
Here, foo creates the ‘foo’ node and its contents. The contents are generated by the bar arrow, which is supposed to be smart enough to detect a previously added ‘bar’ element and change its behavior. Here we use deep for simplicity: if a ‘bar’ element is inside the ‘foo’ element, it should be detected, no matter how deep it is (getChildren >>> hasName "bar" should do the trick too).
So, the expected contents of the test.xml file are:
<?xml version="1.0" encoding="UTF-8"?>
<foo>
  <bar/>
  <baz/>
  <baz/>
</foo>
It doesn't work, of course. Here is what I get:
<?xml version="1.0" encoding="UTF-8"?>
<foo>
  <bar/>
  <bar/>
  <bar/>
</foo>
My questions:
Why can't the ‘bar’ element be detected by the bar arrow?
How can I detect it?
This is one of those cases where type signatures really help. Stare for a second at the type signature:
(+=) :: (ArrowXml a) => a b XmlTree -> a b XmlTree -> a b XmlTree
First off, ArrowXml is a subclass of Arrow, which describes some sort of machinery taking some input to some output. You can think of it like a big factory with conveyor belts carrying things between machines; the combinators are how we build those machines and wire them together into factories. Three of the Arrow combinators, for example, are:
(&&&) :: (Arrow a) => a b c -> a b c' -> a b (c, c')   -- infixr 3
Fanout: send the input to both argument arrows and combine their output.
arr :: (Arrow a) => (b -> c) -> a b c
Lift a function to an arrow.
(.) :: (Category cat) => cat b c -> cat a b -> cat a c
Morphism composition.
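To make these three combinators concrete, here is a minimal sketch using plain functions, which are the simplest Arrow instance. The names double, increment, both and total are made up for illustration and have nothing to do with HXT.

```haskell
import Prelude hiding ((.))
import Control.Arrow (arr, (&&&))
import Control.Category ((.))

-- Two tiny "machines", lifted from ordinary functions with arr.
double, increment :: Int -> Int
double    = arr (* 2)
increment = arr (+ 1)

-- Fanout: the same input goes to both machines.
both :: Int -> (Int, Int)
both = double &&& increment

-- Composition: merge the pair produced by the fanout.
total :: Int -> Int
total = arr (uncurry (+)) . both

main :: IO ()
main = do
  print (both 5)   -- (10,6)
  print (total 5)  -- 16
```

Note that increment never sees the output of double: both receive the original 5.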
Now look very closely at the lowercase letters (type variables) in:
(+=) :: (ArrowXml a) => a b XmlTree -> a b XmlTree -> a b XmlTree
Clearly we're taking two machines which turn bs into XmlTrees and "merging them together" into one machine which takes a b in and expels an XmlTree. But importantly, this type signature tells us that more or less the only way that this can be implemented is:
arr1 += arr2 = arr f . (arr1 &&& arr2)
  where
    f :: (XmlTree, XmlTree) -> XmlTree
    f = _
This is because of "free theorems": if you don't know the type of a parameter, we can prove that you can't really do much with it. (It can be a little more complicated, because the arrow might have structure that isn't totally encapsulated by arr, such as internal counters which are summed together by (.) and set to 0 by arr. So, strictly speaking, replace arr f with a generic a (XmlTree, XmlTree) XmlTree and you're good to go.)
So we must have parallel execution of the two arrows here. Because the combinator (+=) doesn't know what b is, it has no choice but to blithely feed the same b to both arrows in parallel and then try to combine their outputs. So deep (hasName "bar") never looks at the tree that foo has built so far; it only ever sees the original input.
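The effect can be reproduced without HXT. The following is only a toy sketch: merge is a hypothetical stand-in for (+=) that works on lists of strings instead of XML trees, and bar is a simplified version of the arrow from the question.

```haskell
import Control.Arrow (Arrow, arr, (&&&), (>>>))

-- Hypothetical stand-in for (+=): run both arrows on the SAME input,
-- then append the child's output to the parent's output.
merge :: Arrow a => a b [String] -> a b [String] -> a b [String]
merge parent child = (parent &&& child) >>> arr (\(p, c) -> p ++ c)

-- "Smart" child: emits "baz" if its input already contains "bar".
bar :: [String] -> [String]
bar input = if "bar" `elem` input then ["baz"] else ["bar"]

-- Backtick operators are left associative by default, like +=.
result :: [String]
result = (const [] `merge` bar `merge` bar `merge` bar) []

main :: IO ()
main = print result  -- ["bar","bar","bar"]
```

Every bar receives the original empty input, never the list accumulated by the arrows to its left, so the "baz" branch is never taken.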
You can possibly make a mutually recursive solution with deep (hasName "bar") . foo
if you really want, but it seems potentially dangerous (i.e. infinite loop), so it may be safer to simply define something like:
a ++= b = a += (b . a)
where the "current" a is "fed" to b to produce the update. To do this you will have to import (.) from Control.Category, as it is not the same as Prelude's (.) (which does function composition only). This looks like:
import Prelude hiding ((.))
import Control.Category ((.))
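Putting the pieces together, the program from the question might be adapted as in the sketch below. The type signature given to (++=) is my own guess at a sensible one, and no fixity is declared, so the operator defaults to infixl 9. I would expect this to produce the bar, baz, baz output the question asks for, but treat it as a starting point, not a tested solution.

```haskell
module Main (main) where

import Prelude hiding ((.))
import Control.Category ((.))
import Control.Monad (void)
import Text.XML.HXT.Core

-- Feed the tree built so far (a) into b, then add b's result as a child.
-- The signature is an assumption; b must accept an XmlTree as input.
(++=) :: ArrowXml a => a b XmlTree -> a XmlTree XmlTree -> a b XmlTree
a ++= b = a += (b . a)

foo :: ArrowXml a => a XmlTree XmlTree
foo = eelem "foo" ++= bar ++= bar ++= bar

-- Unchanged from the question; it now receives the partially built foo.
bar :: ArrowXml a => a XmlTree XmlTree
bar = ifA (deep (hasName "bar")) (eelem "baz") (eelem "bar")

main :: IO ()
main = void $ runX $ root [] [foo]
       >>> writeDocument [withIndent yes] "test.xml"
```

Two things to keep in mind. First, a appears twice in the definition of (++=), so it is run twice per use, and a long chain repeats the work of everything to its left; that matters if a is expensive or has side effects. Second, if you would rather not hide Prelude's (.), the body can equally be written as a += (a >>> b), since >>> is already in scope in the question's program.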