haskellrepahaskell-vector

Intersperse values into separate Vectors using generate


I am trying to generate a tuple of Vectors by using a function that creates a custom data type (or a tuple) of values from an index. Here is an approach that achieves the desired result:

import Prelude hiding (map, unzip)
import Data.Vector hiding (map)
import Data.Array.Repa
import Data.Functor.Identity

data Foo = Foo {fooX :: Int, fooY :: Int}

unfoo :: Foo -> (Int, Int)
unfoo (Foo x y) = (x, y)

make :: Int -> (Int -> Foo) -> (Vector Int, Vector Int)
make n f = unzip $ generate n getElt where
  getElt i = unfoo $ f i

Except that I would like to do it in a single iteration per Vector, almost like it is shown below, but avoiding multiple evaluation of function f:

make' :: Int -> (Int -> Foo) -> (Vector Int, Vector Int)
make' n f = (generate n getElt1, generate n getElt2) where
  getElt1 i = fooX $ f i
  getElt2 i = fooY $ f i

Just as a note, I understand that Vector library supports fusion, and the first example is already pretty efficient. I need a solution to generate concept, other libraries have very similar constructors (Repa has fromFunction for example), and I am using Vectors here simply to demonstrate a problem.

Maybe some sort of memoizing of f function call would work, but I cannot think of anything.

Edit:

Another demonstration of the problem using Repa:

makeR :: Int -> (Int -> Foo) -> (Array U DIM1 Int, Array U DIM1 Int)
makeR n f = runIdentity $ do
  let arr = fromFunction (Z :. n) (\ (Z :. i) -> unfoo $ f i)
  arr1 <- computeP $ map fst arr
  arr2 <- computeP $ map snd arr
  return (arr1, arr2)

Same as with vectors, fusion saves the day on performance, but an intermediate array arr of tuples is still required, which I am trying to avoid.

Edit 2: (3 years later)

In the Repa example above it will not create an intermediate array, since fromFunction creates a delayed array. Instead it will be even worse, it will evaluate f twice for each index, one for the first array, second time for the second array. Delayed array must be computed in order to avoid such duplication of work.


Solution

  • Looking back at my own question from a few years ago I can now easily show what I was trying to do back than and how to get it done.

    In short, it can't be done purely, therefore we need to resort to ST monad and manual mutation of two vectors, but in the end we do get this nice and pure function that creates only two vectors and does not rely on fusion.

    import Control.Monad.ST
    import Data.Vector.Primitive
    import Data.Vector.Primitive.Mutable
    
    data Foo = Foo {fooX :: Int, fooY :: Int}
    
    make :: Int -> (Int -> Foo) -> (Vector Int, Vector Int)
    make n f = runST $ do
      let n' = max 0 n
      mv1 <- new n'
      mv2 <- new n'
      let fillVectors i
            | i < n' = let Foo x y = f i
                       in write mv1 i x >> write mv2 i y >> fillVectors (i + 1)
            | otherwise = return ()
      fillVectors 0
      v1 <- unsafeFreeze mv1
      v2 <- unsafeFreeze mv2
      return (v1, v2)
    

    And the we use it in a similar fashion it is done with generate:

    λ> make 10 (\ i -> Foo (i + i) (i * i))
    ([0,2,4,6,8,10,12,14,16,18],[0,1,4,9,16,25,36,49,64,81])