javaoptaplanner

Order of facts in ProblemFactCollectionProperty matters for Solution in Optaplanner


I have problem where I need to assign java Money values to my entities with this very basic idea in my PlanningEntity:

@ProblemFactCollectionProperty
@ValueRangeProvider(id = "amountFacts")
private Set<Money> amountFacts;

@PlanningVariable(valueRangeProviderRefs = "amountFacts")
private Money amount;

Most of the time this is working just fine, however, I've run into a specific problem were the planner fails to find a (possible) working solution.

The strange part: Optaplanner does find the solution if I replace the Set with a List. I've double checked and the facts stay the same. If I sort the List it once again fails to find the solution. So if the solution is found sometimes seems to depend on the order of the facts in the fact collection.

Increasing the timeouts to give the planner more time does not help unfortunately.


Solution

  • OptaPlanner (as well as Timefold Solver, the continuation of OptaPlanner), in its default configuration, is entirely reproducible. If you run it two times with the same configuration on the same dataset, you should get two identical results. However, this is not the case with data stored in Java Set.

    In Java, a Set has an undefined iteration order. Which means that, for a set of A and B, sometimes A will come first, and other times it will be B. The solver reads those items sequentially, and the way it chooses through the search space will be defined by this iteration order. And a certain iteration order may lead to finding a good solution, where a different iteration order may find the same solution later, or perhaps never.

    Even a solver which is configured to be totally reproducible in fact isn't reproducible, because the iteration order of Set is not reproducible. With a List, you have a reproducible iteration order, and by pure coincidence in your particular situation, that iteration order leads to a good solution whereas some other iteration order did not.

    This is effectively the same thing as choosing different random seeds for the solver. If you try that, you will find that certain seeds lead to better results than others. That said, there is no way of predicting which seed will be best. The way to minimize this issue is to trust the algorithm - configure the solver with a diverse selection of moves, make sure you have no score traps in your constraints, and give it enough time to search.