Tags: haskell, haskell-criterion

Creating multiple Criterion Benchmarks at once


This code compiles and runs without problems:

module Main where

import Criterion.Main

main :: IO ()
main =
  defaultMain
    [env (return $ [1,2])
         (\is ->
            bgroup "group" (benchmarks is))]

timesTwo :: Int -> Int
timesTwo i = 2 * i

benchmarks :: [Int] -> [Benchmark]
benchmarks is = [ bench "foo" $ nf timesTwo (is !! 0)
                , bench "foo" $ nf timesTwo (is !! 1) ]

Yet if I change the benchmarks function to look like

benchmarks :: [Int] -> [Benchmark]
benchmarks is = map (\i -> bench "foo" $ nf timesTwo i) is

it still compiles, but I get this runtime error:

ghci> main
*** Exception: Criterion atttempted to retrieve a non-existent environment!
        Perhaps you forgot to use lazy pattern matching in a function which
        constructs benchmarks from an environment?
        (see the documentation for `env` for details)

How do I resolve this?

As you can see, my goal is to map over a list obtained from the environment in order to turn it into a list of Benchmarks that I can use with Criterion.

Note: I eventually want to use way more elements than just two, so tuples are not what I want here.


Solution

  • For benchmarking at different sizes, I usually do something like this:

    module Main (main) where
    
    import Criterion.Main
    import System.Random
    import Control.Monad
    
    import qualified Data.List
    import qualified Data.Sequence
    
    int :: Int -> IO Int
    int n = randomRIO (0,n)
    
    benchAtSize :: Int -> Benchmark
    benchAtSize n =
        env (replicateM n (int n)) $
        \xs ->
             bgroup (show n)
               [ bench "Data.List"     $ nf Data.List.sort xs
               , bench "Data.Sequence" $ nf (Data.Sequence.sort . Data.Sequence.fromList) xs
               ]
    
    main :: IO ()
    main = defaultMain (map benchAtSize [100, 1000, 10000])
    

    env is useful for ensuring that two different functions are compared on the same sample; it is not designed to compute your whole dataset up front before any benchmarks run. Also, because all of the data created by env is kept in memory while anything in its scope is being benchmarked, you want to keep each environment as small as possible to reduce overhead while benchmarking.
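
    If you do want every benchmark to draw on a single shared environment, as in the question, the constraint to satisfy is that the shape of the benchmark tree must not depend on the environment's value: criterion enumerates the benchmarks before the environment is created, so mapping over the environment's list forces its spine too early. Here is a minimal sketch of one workaround; the explicit count argument is my illustration of the idea, not part of the original code. The list of benchmarks is built from a statically known count, and the indexing expressions stay unevaluated thunks until their benchmarks actually run.

    module Main where

    import Criterion.Main

    timesTwo :: Int -> Int
    timesTwo i = 2 * i

    -- The shape of this list depends only on count, which is known before
    -- the environment exists; the environment list itself is only forced
    -- inside nf, once a benchmark actually runs.
    benchmarks :: Int -> [Int] -> [Benchmark]
    benchmarks count is =
      [ bench ("element " ++ show k) $ nf timesTwo (is !! k)
      | k <- [0 .. count - 1] ]

    main :: IO ()
    main =
      defaultMain
        [ env (return [1,2]) $ \is ->
            bgroup "group" (benchmarks 2 is) ]

    The number of benchmarks is still fixed up front: only the data, not the structure, can come from the environment, which is the restriction the error message hints at with its note about lazy pattern matching.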