I'm trying to solve a zero-sum game finding the optimal probability distribution for player I. To do so I'm using scipy linprog simplex method.
I've seen an example, I need to transform this game :
G=np.array([
[ 0 2 -3 0]
[-2 0 0 3]
[ 3 0 0 -4]
[ 0 -3 4 0]])
into this linear optimization problem:
Maximize z
Subject to: 2*x2 - 3*x3 + z <= 0
-2*x1 + + 3*x4 + z <= 0
3*x1 + - 4*x4 + z <= 0
- 3*x2 + 4*x3 + z <= 0
with x1 + x2 + x3 + x4 = 1
Here's my actual code:
def simplex(G):
(n,m) = np.shape(G)
A_ub = np.transpose(G)
# we add an artificial variable to maximize, present in all inequalities
A_ub = np.append(A_ub, np.ones((m,1)), axis = 1)
# all inequalities should be inferior to 0
b_ub = np.zeros(m)
# the sum of all variables except the artificial one should be equal to one
A_eq = np.ones((1,n+1))
A_eq[0][n] = 0
b_eq = np.ones(1)
c = np.zeros(n + 1)
# -1 to maximize the artificial variable we're going to add
c[n] = -1
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=(0,None))
return (res.x[:-1], res.fun)
Here's the distribution I get :
[5.87042987e-01 1.77606350e-10 2.79082859e-10 4.12957014e-01]
which does sum up to 1, but I expect
[0 0.6 0.4 0]
I'm trying on a larger game with 6 or 7 lines (and so variables) and it doesn't even sum up to 1.. What did I do wrong ?
Thanks for any help you could provide.
I haven't updated the post since I found a solution. I recommend not using Scipy linprog function, it's badly documented if you don't know much about linear programming, and I found it to be unprecise and inconsistent on numerous examples (and I did try adding a negative sign at that time, as suggested by oyamad).
I switched to the PuLP python library, and had no problem having consistent results from the get-go.