I am trying to use PuLP on dummy graph coloring problem. To solve the problem, we need to assign a colour to each node such that any adjacent nodes don't share a common colour while minimising the total number of colours used.
Here's a dummy data:
edges = [('A', 'H'), ('A', 'B'), ('H', 'G'), ('H', 'K'), ('H', 'J'),
('H', 'I'), ('G', 'F'), ('G', 'K'), ('F', 'E'), ('K', 'E'),
('K', 'J'), ('K', 'D'), ('I', 'J'), ('I', 'D'), ('D', 'C'),
('D', 'B')]
nodes = set(sum(edges, ()))
n_nodes = len(nodes)
Using Google's OR-Tools, I can find a solution:
from ortools.sat.python import cp_model
model = cp_model.CpModel()
# Set decision variables
var = {}
for node in nodes:
var[node] = model.NewIntVar(0, n_nodes-1, node)
# Set objective
model.Minimize(len(set(var.values())))
# Add constraints
for edge in edges:
model.Add(var[edge[0]] != var[edge[1]])
# Solve
solver = cp_model.CpSolver()
status = solver.Solve(model)
if status == cp_model.OPTIMAL:
print('Status: Optimal \n')
for node in nodes:
print(f"{node}: {solver.Value(var[node])}")
Returns:
Status: Optimal
C: 0
A: 5
H: 4
K: 2
F: 1
E: 0
D: 1
J: 0
I: 2
B: 0
G: 3
Side note: While the status says optimal, I feel less colours could be used.
However, when using the similar approach with PuLP, I am unable to see any solution.
from pulp import *
model = LpProblem("coloring", LpMinimize)
# Set decision variables
var = {}
for node in nodes:
var[node] = LpVariable(node, lowBound=0, upBound=n_nodes-1, cat=LpInteger)
# Set objective
model += len(set(var.values()))
# Add constraints
for edge in edges:
model += var[edge[0]] != var[edge[1]]
# Solve
status = model.solve()
print(f'Status: {LpStatus[status]} \n')
for variable in prob.variables():
print(f"{variable.name}, {v.varValue}")
Returns:
Status: Optimal
__dummy, None
Any help on where I have gone wrong would be greatly appreciated.
So you've got 2 issues in your formulation that are causing you problems. First the len(set(...))
is a non-linear operation. when you add expressions to the problem they need to be expressed as linear equations. Your constraint has the same issue. Your logic is correct, but you cannot pass x != y
to the solver, that is also non-linear. You have to look at how to formulate these pieces of logic with linear expressions to reformulate.
Note, that you can always print(model)
in pulp to see what was built, if you print yours, nothing useful comes up... who knows how these things are evaluated and put into the model. I suspect they are evaluated once at construction and you are throwing trivial expressions into the model like x = True or something, and not getting errors.
So here is an alternate formulation of your problem with linear expressions. The this_color != that_color
constraint is a little complex and relies on an indicator variable to enable the either-or construct.
I don't use Google's OR Tools much, but it is mostly just syntactic sugar on top of these concepts, so perhaps there is a way to elegantly state it in OR Tools, but at the end of the day, something like this is getting generated and passed to the solver.
There may be a better formulation of this in graph theory, but this works fine for simple ILP. 3 colors is the answer.
from pulp import *
edges = [('A', 'H'), ('A', 'B'), ('H', 'G'), ('H', 'K'), ('H', 'J'),
('H', 'I'), ('G', 'F'), ('G', 'K'), ('F', 'E'), ('K', 'E'),
('K', 'J'), ('K', 'D'), ('I', 'J'), ('I', 'D'), ('D', 'C'),
('D', 'B')]
nodes = set(sum(edges, ()))
n_nodes = len(nodes)
M = n_nodes + 1
model = LpProblem("coloring", LpMinimize)
# Set decision variables
color = {}
for node in nodes:
color[node] = LpVariable(node, lowBound=1, cat=LpInteger)
num_colors = LpVariable('count', cat = LpInteger)
color_diff = {}
for idx, edge in enumerate(edges):
color_diff[edge] = LpVariable(str(idx), cat=LpBinary)
# Set objective
# model += len(set(var.values()))
model += num_colors # seek to minimize the maximal color index
# Add constraints
# get the max value of the colors...
for node in nodes:
model += num_colors >= color[node]
# ensure the adjacent nodes are different color
for edge in edges:
# force one or the other of these to be true, based on the color_diff
# variable, which is just a binary indicator
model += -M * color_diff[edge] + 1 <= color[edge[0]] - color[edge[1]]
model += -M * (1-color_diff[edge]) + 1 <= color[edge[1]] - color[edge[0]]
# Solve
status = model.solve()
print(f'Status: {LpStatus[status]} \n')
for variable in model.variables():
print(f"{variable.name}, {variable.varValue}")
Result - Optimal solution found
Objective value: 3.00000000
Enumerated nodes: 0
Total iterations: 0
Time (CPU seconds): 0.01
Time (Wallclock seconds): 0.01
Option for printingOptions changed from normal to all
Total time (CPU seconds): 0.01 (Wallclock seconds): 0.01
Status: Optimal
0, 1.0
1, 0.0
10, 0.0
11, 1.0
12, 0.0
13, 1.0
14, 0.0
15, 0.0
2, 0.0
3, 0.0
4, 0.0
5, 0.0
6, 1.0
7, 1.0
8, 0.0
9, 0.0
A, 2.0
B, 1.0
C, 1.0
D, 3.0
E, 1.0
F, 3.0
G, 1.0
H, 3.0
I, 2.0
J, 1.0
K, 2.0
count, 3.0
[Finished in 96ms]