Let's assume there is an event and within this event there are multiple outcomes that can co-exist. An example would be a tennis game where A plays against B and B serves. A couple of possible outcomes are possible:
The following relationships are evident:
A wins with 60-0
and A wins with 60-15
are both a subset of A wins
.
A wins
and B wins
are disjoints. The same applies for A wins with 60-0
and A wins with 60-15
and A wins with 60-0
and B scores an ace
.
I create a matrix that shows whether the m-column and n-row are disjoints or not. Basically, this matrix shows which events are compatible with each other.
import numpy as np
D = np.array([
[0, 1, 0, 0, 0],
[1, 0, 1, 1, 0],
[0, 1, 0, 1, 1],
[0, 1, 0, 0, 0],
[0, 0, 1, 0, 0]
])
So, the second row represents the outcome that B
wins. In this case the outcomes, except B scores an ace
are impossible. and therefore 1. The last row represent that B scores an ace
which is possible in every outcome except A wins with 60-0
.
In addition, I have a matrix, S
, that imply which events are subsets of each other.
S = np.array([
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[1, 0, 0, 0, 0],
[1, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
])
Both A wins with 60-0
or A wins with 60-15
are a subset of A wins
.
I want to create a matrix with all the possible outcomes. In this particular case the matrix would look as follows
X = np.array([
[1, 0, 0, 0, 0], # A wins with 60-30, B scores no ace
[1, 0, 1, 0, 0], # A wins with 60-0
[1, 0, 0, 1, 0], # A wins with 60-15, B scores no ace
[1, 0, 0, 1, 1], # A wins with 60-15, B scores ace
[1, 0, 0, 0, 1], # A wins with 60-30, B scores ace
[0, 1, 0, 0, 0], # B wins with 15-60, B scores no ace
[0, 1, 0, 0, 1] # B wins with 0-60, B scores ace
])
There are a lot more possible outcomes within a tennis match, so I would like to create a function where the variables D
and S
would be the input and X
would be the output.
My attempt so far looks as follows:
import numpy as np
from itertools import product
D = np.array([
[0, 1, 0, 0, 0],
[1, 0, 1, 1, 0],
[0, 1, 0, 1, 1],
[0, 1, 0, 0, 0],
[0, 0, 1, 0, 0]
])
S = np.array([
[0, 0, 1, 1, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
])
def combinations(n: int, D: np.ndarray, S: np.ndarray):
combs = list(product([1, 0], repeat=n))
def disjoint(comb):
for (i, j) in zip(*np.nonzero(D)):
if comb[i] == 1 and comb[j] == 1:
return True
def subset(comb):
for (i, j) in zip(*np.nonzero(S)):
if np.sign(comb[i]) != np.sign(comb[j]):
return True
combs = [comb for comb in combs if not disjoint(comb)]
# combs = [comb for comb in combs if not subset(comb)]
return np.array(combs)
n = 5
c = combinations(n, D, S)
print(c)
The output is partially correct.
[[1 0 1 0 0]
[1 0 0 1 1]
[1 0 0 1 0]
[1 0 0 0 1]
[1 0 0 0 0]
[0 1 0 0 1]
[0 1 0 0 0]
[0 0 1 0 0]
[0 0 0 1 1]
[0 0 0 1 0]
[0 0 0 0 1]
[0 0 0 0 0]]
How can I write the subset-filter in such a way that the wrong combinations are excluded?
In the following answer, 3 solutions are presented:
Solution 1 is the one I came up with first. I think it is the most complex and error-prone, but I left it here for completeness. Solution 2 is what I would probably consider the least error-prone approach. Solution 3 is closest to what has been described as an attempt to a solution in the actual question.
In formulating the problem as a series of implications of events, we can ask ourselves the question:
Which event that happens (or does not happen) implies that another event happens (or does not happen)?
The following events are given:
a
: A winsb
: B winsc
: A wins 60-0d
: A wins 60-15e
: B scores an aceI came up with the following implications (to be read as "row implies column"):
a |
~a |
b |
~b |
c |
~c |
d |
~d |
e |
~e |
|
---|---|---|---|---|---|---|---|---|---|---|
a |
1 | 1 | ||||||||
~a |
1 | 1 | ||||||||
b |
1 | 1 | 1 | 1 | ||||||
~b |
1 | 1 | ||||||||
c |
1 | 1 | 1 | 1 | 1 | |||||
~c |
1 | |||||||||
d |
1 | 1 | 1 | 1 | ||||||
~d |
1 | |||||||||
e |
1 | 1 | ||||||||
~e |
1 |
Take as an example c
's row, which contains the implications:
c → a, c → ~b, c → c, c → ~d, c → ~e
This can be read as:
c → a
: A winning 60-0 implies A winning,c → ~b
: A winning 60-0 implies B not winning,c → c
: A winning 60-0 implies A winning 60-0 (trivial case),c → ~d
: A winning 60-0 implies A not winning 60-15,c → ~e
: A winning 60-0 implies B not scoring an ace.We can skip the trivial cases (the ones implying themselves, on the diagonal) and are left with 14 implications. We can then find all solutions that satisfy the implications programmatically, for example with sympy
(where implications are written with the >>
operator):
from sympy import symbols
from sympy.logic.inference import satisfiable
a, b, c, d, e = symbols("a b c d e")
expr = (( a >> ~b) & (~a >> b) & ( b >> ~a) & ( b >> ~c) & ( b >> ~d) &
(~b >> a) & ( c >> a) & ( c >> ~b) & ( c >> ~d) & ( c >> ~e) &
( d >> a) & ( d >> ~b) & ( d >> ~c) & ( e >> ~c))
for comb in satisfiable(expr, all_models=True):
keys = sorted(comb.keys(), key=lambda s: s.name)
print(", ".join([f"{k.name}={int(comb[k])}" for k in keys]))
# a=1, b=0, c=0, d=0, e=0
# a=1, b=0, c=0, d=0, e=1
# a=1, b=0, c=0, d=1, e=0
# a=1, b=0, c=0, d=1, e=1
# a=1, b=0, c=1, d=0, e=0
# a=0, b=1, c=0, d=0, e=0
# a=0, b=1, c=0, d=0, e=1
We can also start out from a NumPy array that encodes the given table:
import numpy as np
from sympy import symbols, And
from sympy.logic.inference import satisfiable
a = np.asarray([[1, 0, 0, 1, 0, 0, 0, 0, 0, 0],
[0, 1, 1, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 1, 0, 0, 1, 0, 1, 0, 0],
[1, 0, 0, 1, 0, 0, 0, 0, 0, 0],
[1, 0, 0, 1, 1, 0, 0, 1, 0, 1],
[0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
[1, 0, 0, 1, 0, 1, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 0, 1, 0, 0, 1, 0],
[0, 0, 0 ,0, 0, 0, 0, 0, 0, 1]])
# Sanity check: square matrix with even number of entries?
assert (a.shape[0] == a.shape[1]) and (not len(a) % 2)
n = len(a) // 2
variables = symbols(f"x0:{n}")
implications = []
for row in range(n):
for col in range(n):
# Optionally, we could also skip the trivial cases (row == col) here
if a[2 * row, 2 * col]:
implications.append(variables[row] >> variables[col])
if a[2 * row, 2 * col + 1]:
implications.append(variables[row] >> ~variables[col])
if a[2 * row + 1, 2 * col]:
implications.append(~variables[row] >> variables[col])
if a[2 * row + 1, 2 * col + 1]:
implications.append(~variables[row] >> ~variables[col])
for comb in satisfiable(And(*implications), all_models=True):
keys = sorted(comb.keys(), key=lambda s: s.name)
print(", ".join([f"{k.name}={int(comb[k])}" for k in keys]))
# x0=1, x1=0, x2=0, x3=0, x4=0
# x0=1, x1=0, x2=0, x3=1, x4=0
# x0=1, x1=0, x2=0, x3=0, x4=1
# x0=1, x1=0, x2=0, x3=1, x4=1
# x0=1, x1=0, x2=1, x3=0, x4=0
# x0=0, x1=1, x2=0, x3=0, x4=0
# x0=0, x1=1, x2=0, x3=0, x4=1
A more straightforward and less error-prone way, I guess, would be looking at all possible combinations of all pairs of variables separately, by means of truth tables.
For each combination of truth values, here, one would ask the question:
Is this combination of truth values possible/feasible or not?
This would lead to the following truth tables, for which we could then look up corresponding logical operations:
a, b
: 00→0, 01→1, 10→1, 11→0 ⇒ corresponding operation: a XOR b
a, c
: 00→1, 01→0, 10→1, 11→1 ⇒ corresponding operation: c → a
(= a OR ~c
)a, d
: 00→1, 01→0, 10→1, 11→1 ⇒ corresponding operation: d → a
(= a OR ~d
)a, e
: 00→1, 01→1, 10→1, 11→1 ⇒ corresponding operation: True
b, c
: 00→1, 01→1, 10→1, 11→0 ⇒ corresponding operation: b NAND c
b, d
: 00→1, 01→1, 10→1, 11→0 ⇒ corresponding operation: b NAND d
b, e
: 00→1, 01→1, 10→1, 11→1 ⇒ corresponding operation: True
c, d
: 00→1, 01→1, 10→1, 11→0 ⇒ corresponding operation: c NAND d
c, e
: 00→1, 01→1, 10→1, 11→0 ⇒ corresponding operation: c NAND e
d, e
: 00→1, 01→1, 10→1, 11→1 ⇒ corresponding operation: True
For a programmatic approach, I guess I would solve the mapping of the truth table to the corresponding operations via a lookup table (there are only 2⁴=16 possible outcomes for truth tables of two variables). In any case, once we
have the operations, we can again use e.g. sympy
for solving:
from sympy import symbols
from sympy.logic.boolalg import Nand, Xor
from sympy.logic.inference import satisfiable
a, b, c, d, e = symbols("a b c d e")
expr = (Xor(a, b) & (c >> a) & (d >> a) & Nand(b, c) &
Nand(b, d) & Nand(c, d) & Nand(c, e))
for comb in satisfiable(expr, all_models=True):
keys = sorted(comb.keys(), key=lambda s: s.name)
print(", ".join([f"{k.name}={int(comb[k])}" for k in keys]))
# a=1, b=0, c=0, d=0, e=0
# a=1, b=0, c=0, d=0, e=1
# a=1, b=0, c=0, d=1, e=0
# a=1, b=0, c=0, d=1, e=1
# a=0, b=1, c=0, d=0, e=0
# a=0, b=1, c=0, d=0, e=1
# a=1, b=0, c=1, d=0, e=0
We can take a step back and reconsider the first solution. Here we can see that the question we originally asked there contains four questions in fact:
- Which event that happens implies that another event does not happen?
- Which event that happens implies that another event happens?
- Which event that does not happen implies that another event happens?
- Which event that does not happen implies that another event does not happen?
Of these four questions, question 4 is redundant, as it is equivalent to question 2 (or vice versa): a → b
is the same as ~b → ~a
. Note that the same is not true for questions 1 and 3: Question 1 encodes a → ~b
, which is the same as b → ~a
; question 3 encodes ~a → b
, which is the same as ~b → a
. This leaves us with three questions to ask – (1, 2, 3) or (1, 3, 4) – which we can encode in three different matrices/tables. The original question actually provides two of them: D
for question 1, S
for question 2. All that is missing is a representative of question 3, which I call N
in the code below.
In the described setting of events, N
adds the following information: A not winning implies B winning, B not winning implies A winning. Crucially, this is not the same as: A winning implies B not winning, B winning implies A not winning (with the latter already being encoded in D
). While this might not be immediately intuitive, one can easily convince oneself by formulating the corresponding truth tables. In any case, here is the corresponding code:
import numpy as np
from sympy import symbols
from sympy.logic.boolalg import And
from sympy.logic.inference import satisfiable
# Row implies not column
D = np.array([[0, 1, 0, 0, 0],
[1, 0, 1, 1, 0],
[0, 1, 0, 1, 1],
[0, 1, 1, 0, 0],
[0, 0, 1, 0, 0]])
# Row implies column, not column implies not row
S = np.array([[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[1, 0, 0, 0, 0],
[1, 0, 0, 0, 0],
[0, 0, 0, 0, 0]])
# Not row implies column
N = np.array([[0, 1, 0, 0, 0],
[1, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]])
# Sanity check: square matrices of same shape?
assert D.shape == S.shape == N.shape and D.shape[0] == D.shape[1]
n = len(D)
variables = symbols(f"x0:{n}")
implications = []
for row in range(n):
for col in range(n):
if D[row, col]:
implications.append( variables[row] >> ~variables[col])
if S[row, col]:
implications.append( variables[row] >> variables[col])
if N[row, col]:
implications.append(~variables[row] >> variables[col])
for comb in satisfiable(And(*implications), all_models=True):
keys = sorted(comb.keys(), key=lambda s: s.name)
print(", ".join([f"{k.name}={int(comb[k])}" for k in keys]))
# x0=1, x1=0, x2=0, x3=0, x4=0
# x0=1, x1=0, x2=0, x3=1, x4=0
# x0=0, x1=1, x2=0, x3=0, x4=0
# x0=0, x1=1, x2=0, x3=0, x4=1
# x0=1, x1=0, x2=0, x3=0, x4=1
# x0=1, x1=0, x2=0, x3=1, x4=1
# x0=1, x1=0, x2=1, x3=0, x4=0
While the table of solution 1 above encoded all four questions (and thus also the redundant one), we skip the redundant question (and thus a potential source of error) with this split into separate matrices/tables. Of course, we could have likewise just skipped the corresponding entries of the single-table solution, but having three different matrices/tables is probably easier to digest.
As a final comment and optimization: Due to the equivalences mentioned above (a → ~b
is equivalent to b → ~a
; ~a → b
is equivalent to ~b → a
), for matrices D
and N
, we actually only need to fill in and consider one of its triangles – it is not a coincidence that both matrices are symmetric (I had to add a missing 1
in D
for that, the one corresponding to A winning 60-15 implying A not winning 60-0). In other words: once an implication at index (i,j)
is given in D
or N
, the corresponding implication at (j,i)
in the same matrix is redundant and thus can be left out. With this, we could potentially fuse matrices D
and N
into a single one (e.g. using the upper triangle for encoding a → ~b
and the lower triangle for encoding ~a → b
), but this would probably sacrifice again legibility for compactness.