prologclpfdcryptarithmetic-puzzle

Cryptogram Puzzle with Prolog CLPFD


I recently found a small game on the Google Play app store called Cryptogram. There are dozens of apps similar to this one. The idea is to match the number to the colors such that all of the equations sound true.

I was able to get through problems 1-8 and problem 10 fairly quickly by hand, but problem 9 has proven to be more difficult for me.

Problem 9 Problem 9

After some time tinkering and guessing, I gave up and decided to program a solution. I have used Prolog/Datalog for some small tasks as an undergrad as well as some Project Euler problems. Previously I had seen the 15 line Sudoku solver that uses Prolog's Constraint Logic Programming over Finite Domains (clpfd) library, and I decided to give it a go myself. I'm using SWI-Prolog.

:- use_module(library(clpfd)).
problem(Colors) :-
    Colors = [Pink, Cyan, Yellow, Green, Purple, Red, Brown, White, Lime],
    Colors ins 0..9,
    all_distinct(Colors),
    % The leading digit of a number can't be 0
    Pink #\= 0,
    Red #\= 0,
    White #\= 0,
    Green #\= 0,
    Lime #\= 0,
    Cyan #\= 0,
    % I originally tried to write a predicate generalizing numbers and a list of digits
    % but got in way over my head with CLPFD.
    Number1_1 #= (Pink * 1000) + (Cyan * 100) + (Pink * 10) + Yellow,
    Number1_2 #= (Green * 10) + Purple,
    Number1_3 #= (Cyan * 100) + (Red * 10) + Purple,
    Number2_1 #= (Red * 1000) + (Brown * 100) + (White * 10) + Red,
    Number2_2 #= (Lime * 10) + Yellow,
    Number2_3 #= (Red * 1000) + (Lime * 100) + (Purple * 10) + Pink,
    Number3_1 #= (White * 1000) + (Purple * 100) + (Cyan * 10) + White,
    Number3_2 #= (Green * 1000) + (Cyan * 100) + (Yellow * 10) + Purple,
    Number3_3 #= (Cyan * 1000) + (Red * 100) + (Yellow * 10) + Red,
    % I'm not 100% sure whether to use floored or truncated division here.
    % I thought the difference would be a float vs integer output,
    % but that doesn't make sense with finite domains.
    Number1_1 // Number1_2 #= Number1_3,
    Number1_1 rem Number1_2 #= 0,
    Number2_3 #= Number2_1 + Number2_2,
    Number3_3 #= Number3_1 - Number3_2,
    Number3_1 #= Number1_1 - Number2_1,
    Number3_2 #= Number1_2 * Number2_2,
    Number3_3 #= Number1_3 + Number2_3.

The output when I run this query in SWI-Prolog makes me feel like I'm misunderstanding a big concept in CLPFD:

?- problem([Pink, Cyan, Yellow, Green, Purple, Red, Brown, White, Lime]).
Pink in 3..9,
_7756#=Pink+10*Purple+1000*Red+100*Lime,
_7810#=1010*Pink+100*Cyan+Yellow,
all_distinct([Pink, Cyan, Yellow, Green, Purple, Red, Brown, White|...]),
Cyan in 1..7,
_7946#=1000*Cyan+10*Yellow+101*Red,
_7994#=100*Cyan+10*Yellow+1000*Green+Purple,
_8048#=10*Cyan+100*Purple+1001*White,
_8096#=100*Cyan+Purple+10*Red,
Yellow in 0..9,
_8162#=Yellow+10*Lime,
Green in 1..7,
_8216#=10*Green+Purple,
Purple in 0..9,
Red in 1..7,
_8294#=1001*Red+100*Brown+10*White,
Brown in 0..9,
White in 2..8,
Lime in 1..9,
_7756 in 1103..7568,
_8096+_7756#=_7946,
_8294+_8162#=_7756,
_8096 in 110..779,
_7810//_8216#=_8096,
_7810 in 3334..9799,
_8048+_8294#=_7810,
_7810 rem _8216#=0,
_8048 in 2313..8778,
_7946+_7994#=_8048,
_7946 in 1213..7678,
_7994 in 1100..7565,
_8216*_8162#=_7994,
_8216 in 12..79,
_8162 in 14..99,
_8294 in 1021..7486.

I would expect each color in the color list to bind to a single distinct integer in the range 0..9, but that's not what's happening. Can you help me find the solution to this problem?

EDIT

So I picked an arbitrary color and started assigning it numbers in the range that the constraint says should be valid. I ran this query with Cyan bound to 1.

?- problem([Pink, 1, Yellow, Green, Purple, Red, Brown, White, Lime]).
false.

Which doesn't make sense. The previous "output" says "Cyan in 1..7", which I thought meant that any value in that range is valid. However, if I pick another arbitrary value for Cyan:

?- problem([Pink, 2, Yellow, Green, Purple, Red, Brown, White, Lime]).
Pink = 7,
Yellow = 6,
Green = 3,
Purple = 4,
Red = 1,
Brown = 8,
White = 5,
Lime = 9.

I get the answer I was looking for. Though the Cryptogram is solved, I still don't understand why Prolog's CLPFD library didn't find it completely independently.

EDIT 2

I used your suggestions to clean up the code. I also reintroduced the predicate which relates digits to numbers. This code chunk works perfectly.

:- use_module(library(clpfd)).

digit_number(0, [], 1).

digit_number(Number, [Digit|Tail], DigitPlace) :-
    digit_number(NextNumber, Tail, NextDigitPlace),
    DigitPlace #= NextDigitPlace * 10,
    PlaceNumber #= Digit * (NextDigitPlace),
    Number #= PlaceNumber + NextNumber.

digit_number(Number, ColorList) :-
    digit_number(Number, ColorList, _).

problem(Colors) :-
    Colors = [Pink, Cyan, Yellow, Green, Purple, Red, Brown, White, Lime],
    Colors ins 0..9,
    all_distinct(Colors),
    digit_number(Number1_1, [Pink, Cyan, Pink, Yellow]),
    digit_number(Number1_2, [Green, Purple]),
    digit_number(Number1_3, [Cyan, Red, Purple]),
    digit_number(Number2_1, [Red, Brown, White, Red]),
    digit_number(Number2_2, [Lime, Yellow]),
    digit_number(Number2_3, [Red, Lime, Purple, Pink]),
    digit_number(Number3_1, [White, Purple, Cyan, White]),
    digit_number(Number3_2, [Green, Cyan, Yellow, Purple]),
    digit_number(Number3_3, [Cyan, Red, Yellow, Red]),
    Number1_1 // Number1_2 #= Number1_3,
    Number1_1 rem Number1_2 #= 0,
    Number2_1 + Number2_2 #= Number2_3,
    Number3_1 - Number3_2 #= Number3_3,
    Number1_1 - Number2_1 #= Number3_1,
    Number1_2 * Number2_2 #= Number3_2,
    Number1_3 + Number2_3 #= Number3_3,
    label(Colors).

Solution

  • The other answer shows you one way of getting the result you want, but I would like to answer some of your questions.

    I still don't understand why Prolog's CLPFD library didn't find it completely independently.

    Prolog is a more-or-less declarative programming language, but (although we like to pretend, for propaganda reasons) you cannot just write down anything that is logically equivalent to your problem and expect it to be executed correctly and efficiently. In particular, the order of execution of different goals matters a lot, even though it should make no logical difference. This is especially true for arithmetic. Consider:

    ?- between(1, 99999999, N), N > 99999998.
    N = 99999999.  % correct but slooooow
    
    ?- N > 99999998, between(1, 99999999, N).
    ERROR: >/2: Arguments are not sufficiently instantiated
    

    Doing the same with CLP(FD) works much more nicely:

    ?- N in 1..99999999, N #> 99999998.
    N = 99999999.  % correct and fast!
    
    ?- N #> 99999998, N in 1..99999999.
    N = 99999999.  % also correct, also fast!
    

    CLP(FD) allows you to write programs that are more correct, more declarative, and that can often be more efficient than other solutions, unless you hand-optimize them.

    To achieve this, unlike normal Prolog, CLP(FD) separates the collection of constraints from the actual search for solutions. As your program goes along and creates constraints, CLP(FD) will make some simplifications, like in your example where it determines Cyan in 1..7 on its own, or in my example above where it can find the unique solution immediately. But in general, these simplifications do not solve the problem completely.

    One reason for this is, simply, performance: Search can be slow. It can be faster if more constraints are known, because new constraints on already constrainted variables can only make the search space smaller, but never bigger! It makes sense to delay it until concrete answers are actually needed.

    For this reason, to actually get concrete resuls, you need to call a labeling predicate that systematically enumerates solutions. In SWI-Prolog, simple ones are indomain/1 and label/1; a general one is labeling/2. This latter one even allows you to influence the search space exploration strategy, which can be useful if you have some understanding of the problem domain.

    The previous "output" says "Cyan in 1..7", which I thought meant that any value in that range is valid.

    Not quite: It means that if there is a valid solution for Cyan, then it is in the range 1 to 7. It doesn't give a guarantee that all values in that range are solutions. For example:

    ?- X in 1..5, Y in 1..5, X #< Y.
    X in 1..4,
    X#=<Y+ -1,
    Y in 2..5.
    

    3 is in the range 1..4, and 3 is in the range 2..5, so purely based on this we might expect a solution with X = 3 and Y = 3. But that is impossible due to the additional constraint. Only labeling will actually give you answers that are guaranteed solutions, and only if you label all the variables in the query.

    See also the very nice answer here: https://stackoverflow.com/a/27218564/4391743

    Edit:

    % I'm not 100% sure whether to use floored or truncated division here.
    % I thought the difference would be a float vs integer output,
    % but that doesn't make sense with finite domains.
    Number1_1 // Number1_2 #= Number1_3,
    

    Indeed fractional division doesn't make sense here, but Prolog would have told you:

    ?- X in 1..5, Y in 1..5, Z #= X // Y.
    X in 1..5,
    X//Y#=Z,
    Y in 1..5,
    Z in 0..5.
    
    ?- X in 1..5, Y in 1..5, Z #= X / Y.
    ERROR: Domain error: `clpfd_expression' expected, found `_G6388/_G6412'