I am learning Arrow
following the tutorial programming with arrows. I've typed the following code according to the paper except that the SF
is defined by data
, not by newtype
as in the paper (actually, I made this change by chance, since I typed the code from memory):
import Control.Category
import Control.Arrow
import Prelude hiding (id, (.))
data SF a b = SF { runSF :: [a] -> [b] } -- this is the change, using data instead of newtype as in the paper
-- The folowing code is the same as in the paper
instance Category SF where
id = SF $ \x -> x
(SF f) . (SF g) = SF $ \x -> f (g x)
instance Arrow SF where
arr f = SF $ map f
first (SF f) = SF $ unzip >>> first f >>> uncurry zip
instance ArrowChoice SF where
left (SF f) = SF $ \xs -> combine xs (f [y | Left y <- xs])
where
combine (Left _ : ys) (z:zs) = Left z : combine ys zs
combine (Right y : ys) zs = Right y : combine ys zs
combine [] _ = []
delay :: a -> SF a a
delay x = SF $ init . (x:)
mapA :: ArrowChoice a => a b c -> a [b] [c]
mapA f = arr listcase >>>
arr (const []) ||| (f *** mapA f >>> arr (uncurry (:)))
listcase :: [a] -> Either () (a, [a])
listcase [] = Left ()
listcase (x:xs) = Right (x, xs)
When I load the file in ghci
and execute runSF (mapA (delay 0)) [[1,2,3],[4,5,6]]
, it triggers an infinit loop and runs out of memory finally. If I change data
back to newtype
, everything is OK. The same problem happens in ghc 8.0.2, 8.2.2 and 8.6.3.
The same problem also exists even I compile the code into an executable.
I have thought the difference between data
and newtype
, when defining a data structure with only one field, is the runtime cost. But this problem seems to imply more difference between them. Or there may be something that I haven't noticed about the Arrow
type-class.
Can anyone have any ideas? Thanks very much!
Let's look at this example.
data A = A [Int]
deriving (Show)
cons :: Int -> A -> A
cons x (A xs) = A (x:xs)
ones :: A
ones = cons 1 ones
We would expect that ones
should be A [1,1,1,1...]
, because all we have done is wrap a list in a data
constructor. But we would be wrong. Recall that pattern matches are strict for data
constructors. That is, cons 1 undefined = undefined
rather than A (1 : undefined)
. So when we try to evaluate ones
, cons
pattern matches on its second argument, which causes us to evaluate ones
... we have a problem.
newtype
s don't do this. At runtime newtype
constructors are invisible, so it's as if we had written the equivalent program on plain lists
cons :: Int -> [Int] -> [Int]
cons x ys = x:ys
ones = cons 1 ones
which is perfectly productive, since when we try to evaluate ones
, there is a :
constructor between us and the next evaluation of ones
.
You can get back the newtype
semantics by making your data constructor pattern matches lazy:
cons x ~(A xs) = A (x:xs)
This is the problem with your code (I have run into this exact problem doing this exact thing). There are a few reasons data
pattern matches are strict by default; the most compelling one I see is that pattern matching would otherwise be impossible if the type had more than one constructor. There is also a small runtime overhead to lazy pattern matching in order to fix some subtle GC leaks; details linked in the comments.