prologclpfdclpb

CLP(FD): solution number count speedup


I'm working with the following SWI-Prolog version: SWI-Prolog version 9.1.2 for x86_64-linux.

I'm trying to count the number of valid solutions using Prolog and the CLP(FD) module. Currently, to count the number of valid solutions, I'm using the findall/3 predicate in the following way:

findall(1, labeling([max, up, enum], Flat), CountList),
length(CountList, Count),

This works, but with a larger number of solutions (> 1.000.000) it starts getting slow. I imagine that's because what I'm essentially doing is generating all of the solutions instead of reasoning it from the valid domain of my variables. I assume that creating a list with million elements in it doesn't help either.

Is there a way of only counting the number of possible solutions to N variables in CLP(FD) without generating them?

I've tried using the FD set predicates: fd_set/2 and fdset_size/2, and multiply the FD set sizes of all variables. The problem is that the FD set size I get don't account for FD intersection, so the solution count is too high.

I've also tried using the sat_count/2 predicate from the CLP(B) module, but I was unable to use it properly, since it reasons over boolean variables, while my variables are integers, so from this code:

sat_count(+[1|Variables], Count),

I get the following error message:

ERROR: Domain error: `clpb_expr' expected, found `+[1,0,0,2,2,4,6,8,8,10,10,12,14,16,16,18,18,20,22,24,24,26,26,28,30,32,32,34,34,36,38]'

So I assume I should convert every variable to a boolean expression, so that the sum of the sat count of those expressions equals to the number of valid solutions I get by using labeling/2, but I'm not sure how.


Solution

  • As a first fix, consider to use an efficient implementation of countall/2. Currently, good implementations are available in GNU, Scryer, and Trealla. This alone will give you some of the speed up you are asking for.

    ?- countall(( X in 1..7, Y in 5..9, labeling([],[X,Y]) ), T).
       T = 35.
    

    However, this is only an incremental improvement minimizing space consumption and increasing speed by some constant factor only. A better approach would be to use the labeling option upto_in/1 which avoids labeling out slackish variables. Instead, it unifies its argument with the number of solutions like so:

    ?- X in 1..7, Y in 5..9, labeling([upto_in(S)],[X,Y]).
       S = 35, clpz:(X in 1..7), clpz:(Y in 5..9).
    ?- countall(( X in 1..7, Y in 5..9, X #< Y, labeling([],[X,Y]) ), T).
       T = 29.
    ?- X in 1..7, Y in 5..9, X #< Y, labeling([upto_in(S)],[X,Y]).
       X = 1, S = 5, clpz:(Y in 5..9)
    ;  X = 2, S = 5, clpz:(Y in 5..9)
    ;  X = 3, S = 5, clpz:(Y in 5..9)
    ;  ... .
    ?- findall(S, ( X in 1..7, Y in 5..9, X #< Y, labeling([upto_in(S)],[X,Y]) ), Ss), sum_list(Ss, T).
       Ss = [5,5,5,5,4,3,2], T = 29.
    

    In case you are interested going into such details even further, consider to use Scryer, as the author of SWI's library(clpfd) has moved to Scryer with library(clpz).

    There might be even further progress, when considering the answer above:

    ?- X in 1..7, Y in 5..9, X #< Y, labeling([upto_in(S)],[X,Y]).
       X = 1, S = 5, clpz:(Y in 5..9)
    ;  X = 2, S = 5, clpz:(Y in 5..9)
    ;  X = 3, S = 5, clpz:(Y in 5..9)
    ;  X = 4, S = 5, clpz:(Y in 5..9)
    ;  X = 5, S = 4, clpz:(Y in 6..9)
    ;  X = 6, S = 3, clpz:(Y in 7..9)
    ;  X = 7, S = 2, clpz:(Y in 8..9).
    

    A better way might be to collapse the first 4 answers:

    ?- X in 1..7, Y in 5..9, X #< Y, labeling([betterupto_in(S)],[X,Y]).
       S = 20, clpz:(X in 1..4), clpz:(Y in 5..9) % not implemented
    ;  X = 5, S = 4, clpz:(Y in 6..9)
    ;  X = 6, S = 3, clpz:(Y in 7..9)
    ;  X = 7, S = 2, clpz:(Y in 8..9).
    

    Or simply reorder the variables for labeling:

    ?- X in 1..7, Y in 5..9, X #< Y, labeling([upto_in(S)],[Y,X]).
       Y = 5, S = 4, clpz:(X in 1..4)
    ;  Y = 6, S = 5, clpz:(X in 1..5)
    ;  Y = 7, S = 6, clpz:(X in 1..6)
    ;  S = 14, clpz:(X in 1..7), clpz:(Y in 8..9).