I'm rather new to Haskell and was looking at this post here: Cartesian product of 2 lists in Haskell.
In the answer there is this snippet of code:
cartProd xs ys = [(x,y) | x <- xs, y <- ys]
Which with these two lists:
xs = [1,2,3]
ys = [4,5,6]
would yield
[(1,4),(1,5),(1,6),(2,4),(2,5),(2,6),(3,4),(3,5),(3,6)]
Had I not seen this result I would have assumed it would just have returned
[(1,4),(2,5),(3,6)]
because it would traverse both lists at the same time.
But now it - to programming languages I know better - looks like a double for loop used to traverse a matrix:
for (int x = 1; x < 4; x++)
for(int y = 4; y < 7; y++)
//make tuple (x,y)
What causes a list comprehension to behave in this manner?
This introduction explains the syntax of list comprehension. Basically one can say that every x <- list
means an additional nested "for"-loop to generate tuples and every predicate is simply tested. Thus the expression:
[(x,y) | x <- xs, even x, y <- ys, even 3*y-div x 2]
Would be translated in an imperative language as:
for (var x : xs) {
if(even(x)) {
for(var y : ys) {
if(even(3*y-x/2)) {
yield (x,y)
}
}
}
yield
is a keyword that is sometimes used with co-routines. Furthermore as for yield
the evaluation is done lazily. This enables for instance to generate all even integers as:
[x|x <-[2..],even x]
In order to understand list comprehension fundamentally, one needs to know what Monads are. Every list comprehension can be translated into a list monad. For instance your example is translated into:
do x <- xs
(do y <- ys
return (x,y))
Which is again syntactical sugar for:
xs >>= (\x -> (ys >>= \y -> return (x,y)))
A monad is an important concept in functional programming (and probably one better reads the wikipedia page), because it is a bit hard to master. The sometimes say a monads are like burritos,....
Once you more or less understand a monad: a monad is a type-class with a return
statement and a >>=
channeling statement. Now the return
statement for the inner part is easy:
return x = [x]
So that means that each time x
and y
are set, you will create a tuple (x,y)
and return it as a singleton list: thus [(x,y)]
. Now the "bind" operator >>=
needs to "glue" ys
and \y -> return (x,y)
together. This is done by implementing it as:
(>>=) xs f = concat $ map f xs
In other words you do a mapping and concatenate the result of the mapping.
Now if take the second part of the unsugared expression into account:
ys >>= \y -> return (x,y)
It means that for a given x
(we now abstract away), we will map every element in ys
to a tuple (x,y)
and return it. We will thus generate a list of lists with every list being a singleton containing a tuple. Something like (if ys=[1,2]
):
[[(x,1)],[(x,2)]]
Now the >>=
will furthermore concat
it into:
\x -> [(x,1),(x,2)]
Until now, we've abstracted the x
away (assumed it was one). But now we can take the first part of that expression:
xs >>= \x -> [(x,1),(x,2)]
If xs=[3,5]
, it means we will create again lists:
[[(3,1),(3,2)],[(5,1),(5,2)]]
and after the concat:
[(3,1),(3,2),(5,1),(5,2)]
Which is what we expect for:
[(x,y)|x<-[3,5],y<-[1,2]]