haskellcachinglazy-evaluationfrege

Does Haskell/Frege ever recalcuate elements of a lazy list?


Suppose I have a list of all primes, defined as

primes :: (Enum α, Integral α) => [α]
primes = sieve [2..]
  where sieve :: (Integral α) => [α] -> [α]
        sieve []     = undefined
        sieve (x:xs) = x : (sieve $ filter ((/= 0) . (flip mod x)) xs)

and I want to feed primes through multiple, different functions like:

sumOfPrimesLessThan :: (Integral α) => α -> α
sumOfPrimesLessThan n = sum $ takeWhile (< n) primes

or

productOfPrimesLessThan :: (Integral α) => α -> α
productOfPrimesLessThan n = foldl (*) 1 $ takeWhile (< n) primes

or something, as if by doing

main = do print (sumOfPrimesLessThan     1000 :: Integer)
          print (productOfPrimesLessThan 1000 :: Integer)

Would Haskell, at any point in time, recalculate primes, or any part thereof? What would cached? What wouldn't be cached?

Addendum 0: Suppose I had another function

prime = flip elem primes

If prime were to be evaluated with different arguments, would each evaluation re-evaluate primes? For example:

allPrime = all prime

Solution

  • In your case (for Haskell GHC) the answer is that primes is recalculated. Yet, if you had a monomorphic signature like primes :: [Int], that wouldn't be the case. Here's a way to debug this: import Debug.Trace and add the trace function to primes

    primes :: (Enum α, Integral α) => [α]
    primes = trace "call to primes" $ sieve [2..]
      where sieve :: (Integral α) => [α] -> [α]
            sieve []     = undefined
            sieve (x:xs) = x : (sieve $ filter ((/= 0) . (flip mod x)) xs)
    

    Now, every time primes is called, you will see "call to primes" printed out. Compiling your program (with or without optimizations), I get two calls to primes.

    But why?

    Haskell actually compiles your version of primes to a function that takes one type argument, and so the use of primes inside sumOfPrimesLessThan and productOfPrimesLessThan is actually a function call (and function calls are not in general memoized in Haskell, for pretty obvious reasons). When you call sumOfPrimesLessThan 1000 :: Integer, for example, you actually give it two arguments: the value 1000 and the type argument Integer. sumOfPrimesLessThan then passes this second argument on to primes.

    On the other hand, if you had type signatures that were monomorphic, like primes :: [Int], sumOfPrimesLessThan :: Int -> Int, and productOfPrimesLessThan :: Int -> Int, Haskell compiles primes down to just a regular thunk value, so it only gets evaluated once.

    Here is another explanation of how Haskell represents values and functions on the inside I gave not long ago.

    SPECIALIZE pragma (specific to GHC)

    Sometimes you would like to be able to tell GHC that even if your expression is polymorphic, you'd like it to be treated as monomorphic for a couple types. (So you kind of wish you had a second version primes :: [Integer] even if in general primes :: (Enum α, Integral α) => [α].) You can specify this using the SPECIALIZE pragma:

    {-# SPECIALIZE primes :: [Integer] #-}
    primes :: (Enum a,Integral a) => [a]
    primes = ...
    

    And now, just for the Integer type, primes will only be computed once. For other types (like Int), we will still get the same behavior as before.

    In response to addendums

    For multiple calls to prime = flip elem primes where prime is monomorphic ("instantiated"), you would still have only one call to primes. Once primes is monomorphic, it can be shared everywhere. Also, beware the monomorphism restriction in your allPrime = all prime example - you'll need to specify which instance of Foldable want (allPrime = all prime is subtly different from allPrime xs = all prime xs).