haskelllazy-evaluationmonoidssemigroup

Keeping IO lazy under append


I may have been under the false impression that Haskell is lazier than it is, but I wonder if there's a way to get the best of both worlds...

Data.Monoid and Data.Semigroup define two variations of First. The monoidal version models the leftmost, non-empty value, whereas the semigroup version simply models the leftmost value.

This works fine for pure value values, but consider impure values:

x = putStrLn "x" >> return 42
y = putStrLn "y" >> return 1337

Both of these values have the type Num a => IO a. IO a is a Semigroup instance when a is:

instance Semigroup a => Semigroup (IO a)
  -- Defined in `Data.Orphans'

This means that it's possible to combine two IO (First a) values:

Prelude Data.Semigroup Data.Orphans> fmap First x <> fmap First y
x
y
First {getFirst = 42}

As we can see, though, both x and y produce their respective side-effects, even though y is never required.

The same applies for Data.Monoid:

Prelude Data.Monoid> fmap (First . Just) x <> fmap (First . Just) y
x
y
First {getFirst = Just 42}

I think I understand why this happens, given that both the Semigroup and Monoid instances use liftA2, which seems to ultimately be based on IO bind, which is strict, as far as I understand it.

If I dispense with the First abstraction(s), however, I can get lazier evaluation:

first x _ = x

mfirst x y = do
  x' <- x
  case x' of
    (Just _) -> return x'
    Nothing -> y

Using both of those ignores y:

Prelude> first x y
x
42
Prelude> mfirst (fmap Just x) (fmap Just y)
x
Just 42

In both of these cases, y isn't printed.

My question is, then:

Can I get the best of both worlds? Is there a way that I can retain the Semigroup or Monoid abstraction, while still get lazy IO?

Is there, for example, some sort of LazyIO container that I can wrap First values in, so that I get the lazy IO I'd like to have?

The actual scenario I'm after is that I'd like to query a prioritised list of IO resources for data, and use the first one that gives me a useful response. I don't, however, want to perform redundant queries (for performance reasons).


Solution

  • The Alternative instance for the MaybeT monad transformer returns the first successful result, and does not execute the rest of the operations. In combination with the asum function, we can write something like:

    import Data.Foldable (asum)
    import Control.Applicative
    import Control.Monad.Trans.Maybe
    
    action :: Char -> IO Char
    action c = putChar c *> return c
    
    main :: IO ()
    main = do
        result <- runMaybeT $ asum $ [ empty
                                     , MaybeT $ action 'x' *> return Nothing
                                     , liftIO $ action 'v'
                                     , liftIO $ action 'z'
                                     ]
        print result
    

    where the final action 'z' won't be executed.

    We can also write a newtype wrapper with a Monoid instance which mimics the Alternative:

    newtype FirstIO a = FirstIO (MaybeT IO a)
    
    firstIO :: IO (Maybe a) -> FirstIO a
    firstIO ioma = FirstIO (MaybeT ioma)
    
    getFirstIO :: FirstIO a -> IO (Maybe a)
    getFirstIO (FirstIO (MaybeT ioma)) = ioma
    
    instance Monoid (FirstIO a) where
        mempty = FirstIO empty
        FirstIO m1 `mappend` FirstIO m2 = FirstIO $ m1 <|> m2
    

    The relationship between Alternative and Monoid is explained in this other SO question.