scalaconcurrencyfunctional-programmingapplicativehaxl

How to use Applicative for concurrency?


This is a follow-up to my previous question. I copied the example below from Haxl

Suppose I am fetching data from a blog server to render a blog page, which contains the recent posts, popular posts and posts topics.

I have the following Data Fetching API:

val getRecent  : Server => Seq[Post] = ...
val getPopular : Server => Seq[Post] = ...
val getTopics  : Server => Seq[Topic] = ...

Now I need to compose them to implement a new function getPageData

val getPageData: Server => (Seq[Post],  Seq[Post], Seq[Topic])

Haxl suggests using a new monad Fetch to make the API composable.

val getRecent  : Fetch[Seq[Posts]] = ...
val getPopular : Fetch[Seq[Posts]] = ...
val getTopics  : Fetch[Seq[Topic]] = ...

Now I can define my getPageData: Fetch[A] with monadic composition

val getPageData = for {
  recent  <- getRecent
  popular <- getPopular
  topics  <- getTopics
} yield (recent, popular, topics)

but it does not run getRecent, getPopular, and getTopics concurrently.

Haxl suggests using applicative composition <*> to compose "concurrent" functions (i.e. the functions that can run concurrently). So my questions are:


Solution

  • How to implement getPageData assuming Fetch[A] is an Applicative ?

    All we need to do is drop the monadic bind >>= in favour of the applicative <*>. So instead of

    val getPageData = for {
      recent  <- getRecent
      popular <- getPopular
      topics  <- getTopics
    } yield (recent, popular, topics)
    

    we would write something like (in Haskell syntax; sorry, I can't do Scala off the top of my head):

    getPageData = makeTriple <$> getRecent <*> getPopular <*> getTopics
      where
        makeTriple x y z = (x, y, z)
    

    But whether this has the desired effect is contingent upon the second question!

    How to implement Fetch as an Applicative but not a Monad ?

    The key distinction between monadic and applicative sequencing is that the monadic one can depend upon the value inside a monadic value, whereas the applicative <*> cannot. Notice how the monadic expression for getPageData above binds the names recent and popular before reaching getTopics. Those names could have been used to change the structure of the expression, for example by getting some other data source in case recent is empty. But with the applicative expression, the results of getRecent and getPopular are not factors in the structure of the expression itself. This property allows us to fire off each term in the applicative expression concurrently, because we know the structure of the expression statically.

    So, using the observation above, and obviously the particular shape of the Fetch datatype, we can come up with a suitable definition for <*>. I think the following illustrates the general idea:

    data Fetch a = Fetch { runFetch :: IO a }
    
    fetchF <*> fetchX = Fetch $ do
      -- Fire off both IOs concurrently.
      resultF <- async $ runFetch fetchF
      resultX <- async $ runFetch fetchX
      -- Wait for both results to be ready.
      f <- wait resultF
      x <- wait resultX
      return $ f x
    

    For comparison, suppose we tried to do monadic bind with concurrent evaluation:

    fetchF >>= fetchK = Fetch $ do
      resultF <- async $ runFetch fetchF
      -- Oh no, we need resultF in order to produce the next
      -- Fetch value! We just have to wait...
      f <- wait resultF
      fetchX <- async $ runFetch (fetchK f)
      x <- wait $ runFetch fetchX
      return $ f x