Tags: prolog, swi-prolog, gnu-prolog

Up-to-date Prolog implementation benchmarks?


Are there any up-to-date Prolog implementation benchmarks (with results)?

I found this comparison on the web. Surprisingly, it shows a 20-fold gap between SWI-Prolog and Aquarius. I suspect that these results are pretty old. Does this gap still hold? Personally, I'd also like to see some comparisons with the occurs check turned on, since it has a major impact on performance and some compilers might be better than others at optimizing it away.
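
For context, here is a minimal sketch of the difference the occurs check makes, in SWI-Prolog syntax (the flag name and the default behaviour vary between systems):

    ?- X = f(X).                            % plain unification: silently builds a cyclic term
    ?- unify_with_occurs_check(X, f(X)).    % ISO built-in: fails instead of building the cycle
    ?- set_prolog_flag(occurs_check, true). % SWI-Prolog flag: all unifications perform the check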

Of the more recent comparisons, I found this claim that one system is 2x faster than SWI and that YAP is 4x faster than SWI on one specific code base.


Edit:

In response to a comment asking for "a specific case where the occurs check is needed for a real-world problem":

Sure: type inference in Haskell, OCaml, or Swift, and theorem provers such as this one. I also think the burden is on the programmer to prove that their code doesn't need the occurs check; tests can only prove that you do need it, not that you don't.
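
To make this concrete, here is a toy sketch of my own (not code from any of those systems): a unification-based type inferencer for a tiny lambda calculus, where the occurs check is exactly what rejects the self-application \x -> x x instead of inferring an infinite type.

    % Terms: var(X), lam(X, Body), app(Fun, Arg); function types are written (A -> B)
    % and type variables are plain Prolog variables (member/2 is from library(lists)).
    type(Env, var(X), T) :-
        member(X-T, Env).
    type(Env, lam(X, Body), (A -> B)) :-
        type([X-A|Env], Body, B).
    type(Env, app(Fun, Arg), B) :-
        type(Env, Fun, (A -> B)),
        type(Env, Arg, A).

    % ?- type([], lam(x, app(var(x), var(x))), T).
    % With plain unification this "succeeds" by binding T to a cyclic (infinite) type;
    % with the occurs check enabled (e.g. via an occurs_check flag), it fails, as it should.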


Solution

  • There's also a classical set of Prolog benchmarks that can be used for comparing Prolog implementations. Some Prolog systems include them (e.g. SWI-Prolog). They are also included in the Logtalk distribution, which allows running them with the supported backends:

    https://github.com/LogtalkDotOrg/logtalk3/tree/master/examples/bench

    In the current Logtalk git version, you can start it with the backend you want to benchmark and use the queries:

    ?- {bench(loader)}.
    ...
    ?- run.
    
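    As a concrete sketch, assuming the usual Logtalk integration scripts are on your PATH (their names can differ depending on how Logtalk was installed), starting Logtalk with a given backend looks like this; the {bench(loader)} and run queries are then the same for every backend:

    $ swilgt       # Logtalk running on top of SWI-Prolog
    $ gplgt        # Logtalk running on top of GNU Prolog
    $ sicstuslgt   # Logtalk running on top of SICStus Prolog
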

    The run/0 goal runs each benchmark 1000 times and reports the total time. Use run/1 for a different number of repetitions. For example, on my macOS system, using SWI-Prolog 8.3.15, I get:

    ?- run.
    boyer: 20.897818 seconds
    chat_parser: 7.962188999999999 seconds
    crypt: 0.14653999999999812 seconds
    derive: 0.004462999999997663 seconds
    divide10: 0.002300000000001745 seconds
    log10: 0.0011489999999980682 seconds
    meta_qsort: 0.2729539999999986 seconds
    mu: 0.04534600000000211 seconds
    nreverse: 0.016964000000001533 seconds
    ops8: 0.0016230000000021505 seconds
    poly_10: 1.9540520000000008 seconds
    prover: 0.05286200000000463 seconds
    qsort: 0.030829000000004214 seconds
    queens_8: 2.2245050000000077 seconds
    query: 0.11675499999999772 seconds
    reducer: 0.00044199999999960937 seconds
    sendmore: 3.048624999999994 seconds
    serialise: 0.0003770000000073992 seconds
    simple_analyzer: 0.8428750000000065 seconds
    tak: 5.495768999999996 seconds
    times10: 0.0019139999999993051 seconds
    unify: 0.11229400000000567 seconds
    zebra: 1.595203000000005 seconds
    browse: 31.000829000000003 seconds
    fast_mu: 0.04102400000000728 seconds
    flatten: 0.028527999999994336 seconds
    nand: 0.9632950000000022 seconds
    perfect: 0.36678499999999303 seconds
    true.
    

    For SICStus Prolog 4.6.0 I get:

    | ?- run.
    boyer: 3.638 seconds
    chat_parser: 0.7650000000000006 seconds
    crypt: 0.029000000000000803 seconds
    derive: 0.0009999999999994458 seconds
    divide10: 0.001000000000000334 seconds
    log10: 0.0009999999999994458 seconds
    meta_qsort: 0.025000000000000355 seconds
    mu: 0.004999999999999893 seconds
    nreverse: 0.0019999999999997797 seconds
    ops8: 0.001000000000000334 seconds
    poly_10: 0.20500000000000007 seconds
    prover: 0.005999999999999339 seconds
    qsort: 0.0030000000000001137 seconds
    queens_8: 0.2549999999999999 seconds
    query: 0.024999999999999467 seconds
    reducer: 0.001000000000000334 seconds
    sendmore: 0.6079999999999997 seconds
    serialise: 0.0019999999999997797 seconds
    simple_analyzer: 0.09299999999999997 seconds
    tak: 0.5869999999999997 seconds
    times10: 0.001000000000000334 seconds
    unify: 0.013000000000000789 seconds
    zebra: 0.33999999999999986 seconds
    browse: 4.137 seconds
    fast_mu: 0.0070000000000014495 seconds
    nand: 0.1280000000000001 seconds
    perfect: 0.07199999999999918 seconds
    yes
    

    For GNU Prolog 1.4.5, I use the sample embedding script in logtalk3/scripts/embedding/gprolog to create an executable that includes the bench example fully compiled:

    | ?- run.
    boyer: 9.3459999999999983 seconds
    chat_parser: 1.9610000000000003 seconds
    crypt: 0.048000000000000043 seconds
    derive: 0.0020000000000006679 seconds
    divide10: 0.00099999999999944578 seconds
    log10: 0.00099999999999944578 seconds
    meta_qsort: 0.099000000000000199 seconds
    mu: 0.012999999999999901 seconds
    nreverse: 0.0060000000000002274 seconds
    ops8: 0.00099999999999944578 seconds
    poly_10: 0.72000000000000064 seconds
    prover: 0.016000000000000014 seconds
    qsort: 0.0080000000000008953 seconds
    queens_8: 0.68599999999999994 seconds
    query: 0.041999999999999815 seconds
    reducer: 0.0 seconds
    sendmore: 1.1070000000000011 seconds
    serialise: 0.0060000000000002274 seconds
    simple_analyzer: 0.25 seconds
    tak: 1.3899999999999988 seconds
    times10: 0.0010000000000012221 seconds
    unify: 0.089999999999999858 seconds
    zebra: 0.63499999999999979 seconds
    browse: 10.923999999999999 seconds
    fast_mu: 0.015000000000000568 seconds
    
    (27352 ms) yes
    

    I suggest you try these benchmarks yourself, running them on your own computer with the Prolog systems that you want to compare. When doing so, keep in mind that this is a limited set of benchmarks that does not necessarily reflect the relative performance of these systems in non-trivial applications.


    Ratios (each system's time as a percentage of the SWI-Prolog time, so lower means faster; e.g. boyer for SICStus: 3.638 / 20.898 ≈ 17.4%; a dash marks a benchmark missing from that system's results above):

                  SICStus/SWI   GNU/SWI
    
    boyer               17.4%     44.7%
    browse              13.3%     35.2%
    chat_parser          9.6%     24.6%
    crypt               19.8%     32.8%
    derive              22.4%     44.8%
    divide10            43.5%     43.5%
    fast_mu             17.1%     36.6%
    flatten                 -         -
    log10               87.0%     87.0%
    meta_qsort           9.2%     36.3%
    mu                  11.0%     28.7%
    nand                13.3%         -
    nreverse            11.8%     35.4%
    ops8                61.6%     61.6%
    perfect             19.6%         -
    poly_10             10.5%     36.8%
    prover              11.4%     30.3%
    qsort                9.7%     25.9%
    queens_8            11.5%     30.8%
    query               21.4%     36.0%
    reducer            226.2%      0.0%
    sendmore            19.9%     36.3%
    serialise          530.5%   1591.5%
    simple_analyzer     11.0%     29.7%
    tak                 10.7%     25.3%
    times10             52.2%     52.2%
    unify               11.6%     80.1%
    zebra               21.3%     39.8%
    

    P.S. Be sure to use Logtalk 3.43.0 or later as it includes portability fixes for the bench example, including for GNU Prolog, and a set of basic unit tests.
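
    A quick way to check a new setup before timing anything is to run those unit tests; assuming the bench example follows the usual Logtalk convention of providing a tester.lgt driver file (check the example directory for the exact name), that would be:

    ?- {bench(tester)}.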