How does the minizinc pentominoes regular constraint example work?

The minizinc benchmarks repository contains several pentomino examples.

width = 5;
height = 4;
filled = 1;
ntiles = 5;
size = 864;
tiles =  [|63,6,1,2,0,
   |9,6,1,2,378,
   |54,6,1,2,432,
   |4,6,1,2,756,
   |14,6,1,2,780,
   |];
dfa = [7,5,5,5,5,3,0,2,2,2,2,2,7,5,5,5,5,3,19,4,4,4,4,3,30,4,4,4,4,3,0,10,10,10,10,10,46,8,8,8,8,0,0,12,12,12,12,13,0,15,15,15,15,14,0,16,16,16,16,16,0,18,18,18,18,17,0,20,20,20,20,20,0,21,21,21,21,21,0,22,22,22,22,22,0,23,23,23,23,23,0,28,28,28,28,0,47,22,22,22,22,22,47,23,23,23,23,23,46,11,11,11,11,24,0,26,26,26,26,26,0,25,25,25,25,25,0,27,27,27,27,25,0,29,29,29,29,26,0,31,31,31,31,31,32,0,0,0,0,0,33,0,0,0,0,0,34,0,0,0,0,0,35,0,0,0,0,0,36,0,0,0,0,0,46,9,9,9,9,6,47,16,16,16,16,16,0,35,35,35,35,0,60,35,35,35,35,0,0,37,37,37,37,39,0,39,39,39,39,39,60,37,37,37,37,39,0,40,40,40,40,40,0,41,41,41,41,41,0,42,42,42,42,42,0,43,43,43,43,43,0,45,45,45,45,45,0,47,47,47,47,47,60,47,47,47,47,47,48,0,0,0,0,0,49,44,44,44,44,0,53,38,38,38,38,38,60,0,0,0,0,0,0,50,50,50,50,50,0,51,51,51,51,0,0,52,52,52,52,52,0,54,54,54,54,54,0,55,55,55,55,55,0,56,56,56,56,56,0,57,57,57,57,57,0,60,60,60,60,0,0,58,58,58,58,58,0,59,59,59,59,59,61,55,55,55,55,0,62,0,0,0,0,0,63,0,0,0,0,0,0,62,62,62,62,0,0,63,63,63,63,0,0,2,2,2,2,2,3,4,3,3,3,3,2,0,2,2,2,2,3,4,3,3,3,3,5,9,5,5,5,5,6,0,6,6,6,6,7,0,7,7,7,7,8,0,8,8,8,8,0,9,0,0,0,0,2,0,2,2,2,2,4,4,14,4,4,5,2,2,0,2,2,2,3,3,10,3,3,5,3,3,12,3,3,5,4,4,14,4,4,5,8,8,0,8,8,0,9,9,0,9,9,13,11,11,0,11,11,11,11,11,22,11,11,11,7,7,15,7,7,11,13,13,0,13,13,13,6,6,15,6,6,0,0,0,22,0,0,0,6,6,25,6,6,0,17,17,29,17,17,16,19,19,0,19,19,19,20,20,0,20,20,20,21,21,0,21,21,21,22,22,0,22,22,0,23,23,0,23,23,24,24,24,0,24,24,24,26,26,0,26,26,0,26,26,27,26,26,0,0,0,27,0,0,0,18,18,29,18,18,0,0,0,30,0,0,0,28,28,0,28,28,0,30,30,0,30,30,0,32,32,0,32,32,32,33,33,0,33,33,33,34,34,0,34,34,0,35,35,0,35,35,35,36,36,0,36,36,36,0,0,37,0,0,0,31,31,40,31,31,0,0,0,45,0,0,0,39,39,0,39,39,39,41,41,0,41,41,41,42,42,0,42,42,42,43,43,0,43,43,0,44,44,0,44,44,44,45,45,0,45,45,0,38,38,46,38,38,0,0,0,50,0,0,0,0,0,51,0,0,0,47,47,0,47,47,47,49,49,0,49,49,49,51,51,0,51,51,0,48,48,52,48,48,0,0,0,53,0,0,0,0,0,54,0,0,0,53,53,0,53,53,0,54,54,0,54,54,0,2,2,0,2,2,2,3,3,3,4,3,3,2,2,2,0,2,2,3,3,3,4,3,3,2,2,2,0,2,2,3,3,3,3,8,3,2,2,2,2,0,2,3,3,3,3,8,3,5,5,5,5,0,5,6,6,6,6,0,6,7,7,7,7,0,7,0,0,0,0,9,0,4,4,4,4,13,4,10,10,10,10,0,10,11,11,11,11,0,11,12,12,12,12,0,12,13,13,13,13,0,13,0,0,0,0,14,0,2,2,2,2,0,2,]

As far as I understand it, the goal is to fill a 5 x 4 board with 5 pentominoes. Some overlap and/or exclusion of tiles seems to be required, which is not usual.

Here is the minizinc solution:

include "globals.mzn";

int: Q = 1;
int: S = 2;
int: Fstart = 3;
int: Fend = 4;
int: Dstart = 5;

int: width;
int: height;
int: filled;
int: ntiles;
int: size;
array[1..ntiles,1..Dstart] of int: tiles;
array[1..size] of int: dfa;

array[1..width*height] of var filled..ntiles+1: board;

constraint forall (h in 1..height, w in 1..width-1) (
    board[(h-1)*width+w] != ntiles+1);

constraint forall (h in 1..height) (
    board[(h-1)*width+width] = ntiles+1);

constraint
  forall (t in 1..ntiles)(
    let {
      int: q = tiles[t,Q],
      int: s = tiles[t,S],
      set of int: f = tiles[t,Fstart]..tiles[t,Fend],
      array[1..q,1..s] of int: d =
        array2d(1..q,1..s,
                [ dfa[i] | i in tiles[t,Dstart]+1..tiles[t,Dstart]+q*s] )
    }
    in
      regular(board,q,s,d,1,f)
  );

solve :: int_search(board, input_order, indomain_min, complete) satisfy;

output [show(board)];

I've not been able to find much documentation on the minizinc benchmarks. They were part of the minizinc challenge for a few years but not anymore.

Chapter 3 of Mikael Lagerkvist's thesis is perhaps partially relevant. It describes placing pentominoes using the regular constraint in the gecode toolkit. Section 3.2 illustrates a string representation for placing the L pentomino using a regular expression string of 0s and 1s: 1s where each square of the board overlaps a square of the L pentomino. Piece rotations are handled in section 3.3 using disjunctions of regular expressions. In general, there are 8 symmetries to consider for each pentomino (2 mirrorings and 4 rotations).

The minizinc data above does not use disjunctions of 8 binary strings to represent pentomino tiles but the minizinc code does use the regular constraint.

I realise gecode and minizinc work differently and in this case minizinc has chosen an alternative to difficult to read and write binary string regular expression disjunctions. The 864 long string of numbers in the dfa variable is probably the core part of the minizinc solution I'm missing. The rest of the solution (removing board symmetries) I can probably figure out after that.

I don't see how to fill a 5 x 4 board with 5 pentominoes without overlaps and/or exclusions. What is the goal of this example?

How does the minizinc pentomino tile and dfa representation work?

How does pentomino rotation and mirroring work in this minizinc representation?

Here is the only board solution from the above code:

[1, 1, 1, 2, 6, 3, 3, 1, 2, 6, 3, 5, 5, 5, 6, 3, 3, 3, 4, 6]

Here is the solution reformatted into a 5 x 4 board:

[1, 1, 1, 2, 6, 
 3, 3, 1, 2, 6, 
 3, 5, 5, 5, 6, 
 3, 3, 3, 4, 6]

Note the 6s. What do they represent?

See this web page for the complete set of 50, 5 x 4 pentomino tilings without overlaps, exclusions or holes.

There are alternative approaches for solving this sort of problem with minizinc. The geost predicate is one possibility. None of these alternatives are relevant for this question.

Alternative software suggestions or discussion, beyond minizinc and gecode, are again not relevant.

Solution

Both the original MiniZinc model and the one in the repository in the comment are ones I wrote. While my licentiate thesis and the linked repository use regular expressions to express the constraints, the original MiniZinc challenge model was written when MiniZinc only had support for DFA inputs, as this is what the regular constraint inside solvers actually use (§). The DFAs were in fact generated by taking the Gecode model and writing a small program (lost to time) that printed out the DFA for the regular expressions in the Gecode example file using the Gecode regular expression to DFA translation. A nice thing about the translation is that for a piece that has some symmetry, the DFA minimization will remove the symmetries. The instance generator linked was written this year, and uses the modern MiniZinc feature that accepts regular expressions. It was just easier to write that way.

So, in order to understand the long list of numbers, you have to view it as a DFA. The list represent a matrix, where the indexes are the states and the next input, and the values are the next state to go to. The other arguments to the regular constraint indicate the number of states, the number of symbols in the alphabet, and the starting and accepting states of the DFA.

As for the 6's at the end of the matrix, these are end-of-line markers. They are there to make sure that a piece is not split apart. Consider the simple piece XXX on a 4 by 4 board with no other pieces (so X is 1, and empty is 0). With the expression 0*1110*, all placements of the piece are modeled, but so are placements like

_ _ _ _
_ _ X X
X X _ _
_ _ _ _

In order to avoid that, an additional end-of-line column is added to the board. in this particular case, the end of line marker is it's own unique value, which means that the model could even handle disjoint pieces. If all pieces are connected and the board is not full, the the end of line markers could be the same as the empty square.

I have a couple of other papers that use a similar construction of placing parts if you find this interesting, as well as the original publication.

Footnote: (§) Technically, most direct implementations of Pesant's algorithm can handle NFAs as well (disregarding epsilon transitions), which could be used to optimize the representation. However, DFA minimization is a known and fast method, while NFA minimization is much much harder. The fewer the states in the FA, the faster the propagation will be.