I have a sparse 60000x10000 matrix M where each element is either a 1 or 0. Each column in the matrix is a different combination of signals (ie. 1s and 0s). I want to choose five column vectors from M and take the Hadamard (ie. element-wise) product of them; I call the resulting vector the strategy vector. After this step, I compute the dot product of this strategy vector with a target vector (that does not change). The target vector is filled with 1s and -1s such that having a 1 in a specific row of the strategy vector is either rewarded or penalised.
Is there some heuristic or linear algebra method that I could use to help me pick the five vectors from the matrix M that result in a high dot product? I don't have any experience with Google's OR tools nor Scipy's optimization methods so I am not too sure if they can be applied to my problem. Advice on this would be much appreciated! :)
Note: the five column vectors given as the solution does not need to be the optimal one; I'd rather have something that does not take months/years to run.
First of all, thanks for a good question. I don't get to practice numpy that often. Also, I don't have much experience in posting to SE, so any feedback, code critique, and opinions relating to the answer are welcome.
This was an attempt at finding an optimal solution at first, but I didn't manage to deal with the complexity. The algorithm should, however, give you a greedy solution that might prove to be adequate.
Colab Notebook (Python code + Octave validation)
Core Idea
Note: During runtime, I've transposed the matrix. So, the column vectors in the question correspond to row vectors in the algorithm.
Notice that you can multiply the target with one vector at a time, effectively getting a new target, but with some 0s
in it. These will never change, so you can filter out some computations by removing those rows (columns, in the algorithm) in further computations entirely - both from the target and the matrix. - you're then left with a valid target again (only 1s
and -1
in it).
That's the basic idea of the algorithm. Given:
n
: number of vectors you need to pickb
: number of best vectors to checkm
: complexity of matrix operations to check one vectorDo an exponentially-complex O((n*m)^b)
depth-first search, but decrease the complexity of the calculations in deeper layers by reducing target/matrix size, while cutting down a few search paths with some heuristics.
Heuristics used
The best score achieved so far is known in every recursion step. Compute an optimistic vector (turn -1
to 0
) and check what scores can still be achieved. Do not search in levels where the score cannot be surpassed.
This is useless if the best vectors in the matrix have 1s
and 0s
equally distributed. The optimistic scores are just too high. However, it gets better with more sparsity.
Ignore duplicates. Basically, do not check duplicate vectors in the same layer. Because we reduce the matrix size, the chance for ending up with duplicates increases in deeper recursion levels.
Further Thoughts on Heuristics
The most valuable ones are those that eliminate the vector choices at the start. There's probably a way to find vectors that are worse-or-equal than others, with respect to their affects on the target. Say, if v1
only differs from v2
by an extra 1
, and target has a -1
in that row, then v1
is worse-or-equal than v2
.
The problem is that we need to find more than 1 vector, and can't readily discard the rest. If we have 10 vectors, each worse-or-equal than the one before, we still have to keep 5 at the start (in case they're still the best option), then 4 in the next recursion level, 3 in the following, etc.
Maybe it's possible to produce a tree and pass it on in into recursion? Still, that doesn't help trim down the search space at the start... Maybe it would help to only consider 1 or 2 of the vectors in the worse-or-equal chain? That would explore more diverse solutions, but doesn't guarantee that it's more optimal.
Warning: Note that the MATRIX and TARGET in the example are in int8
. If you use these for the dot product, it will overflow. Though I think all operations in the algorithm are creating new variables, so are not affected.
Code
# Given:
TARGET = np.random.choice([1, -1], size=60000).astype(np.int8)
MATRIX = np.random.randint(0, 2, size=(10000,60000), dtype=np.int8)
# Tunable - increase to search more vectors, at the cost of time.
# Performs better if the best vectors in the matrix are sparse
MAX_BRANCHES = 3 # can give more for sparser matrices
# Usage
score, picked_vectors_idx = pick_vectors(TARGET, MATRIX, 5)
# Function
def pick_vectors(init_target, init_matrix, vectors_left_to_pick: int, best_prev_result=float("-inf")):
assert vectors_left_to_pick >= 1
if init_target.shape == (0, ) or len(init_matrix.shape) <= 1 or init_matrix.shape[0] == 0 or init_matrix.shape[1] == 0:
return float("inf"), None
target = init_target.copy()
matrix = init_matrix.copy()
neg_matrix = np.multiply(target, matrix)
neg_matrix_sum = neg_matrix.sum(axis=1)
if vectors_left_to_pick == 1:
picked_id = np.argmax(neg_matrix_sum)
score = neg_matrix[picked_id].sum()
return score, [picked_id]
else:
sort_order = np.argsort(neg_matrix_sum)[::-1]
sorted_sums = neg_matrix_sum[sort_order]
sorted_neg_matrix = neg_matrix[sort_order]
sorted_matrix = matrix[sort_order]
best_score = best_prev_result
best_picked_vector_idx = None
# Heuristic 1 (H1) - optimistic target.
# Set a maximum score that can still be achieved
optimistic_target = target.copy()
optimistic_target[target == -1] = 0
if optimistic_target.sum() <= best_score:
# This check can be removed - the scores are too high at this point
return float("-inf"), None
# Heuristic 2 (H2) - ignore duplicates
vecs_tried = set()
# MAIN GOAL: for picked_id, picked_vector in enumerate(sorted_matrix):
for picked_id, picked_vector in enumerate(sorted_matrix[:MAX_BRANCHES]):
# H2
picked_tuple = tuple(picked_vector)
if picked_tuple in vecs_tried:
continue
else:
vecs_tried.add(picked_tuple)
# Discard picked vector
new_matrix = np.delete(sorted_matrix, picked_id, axis=0)
# Discard matrix and target rows where vector is 0
ones = np.argwhere(picked_vector == 1).squeeze()
new_matrix = new_matrix[:, ones]
new_target = target[ones]
if len(new_matrix.shape) <= 1 or new_matrix.shape[0] == 0:
return float("-inf"), None
# H1: Do not compute if best score cannot be improved
new_optimistic_target = optimistic_target[ones]
optimistic_matrix = np.multiply(new_matrix, new_optimistic_target)
optimistic_sums = optimistic_matrix.sum(axis=1)
optimistic_viable_vector_idx = optimistic_sums > best_score
if optimistic_sums.max() <= best_score:
continue
new_matrix = new_matrix[optimistic_viable_vector_idx]
score, next_picked_vector_idx = pick_vectors(new_target, new_matrix, vectors_left_to_pick - 1, best_prev_result=best_score)
if score <= best_score:
continue
# Convert idx of trimmed-down matrix into sorted matrix IDs
for i, returned_id in enumerate(next_picked_vector_idx):
# H1: Loop until you hit the required number of 'True'
values_passed = 0
j = 0
while True:
value_picked: bool = optimistic_viable_vector_idx[j]
if value_picked:
values_passed += 1
if values_passed-1 == returned_id:
next_picked_vector_idx[i] = j
break
j += 1
# picked_vector index
if returned_id >= picked_id:
next_picked_vector_idx[i] += 1
best_score = score
# Convert from sorted matrix to input matrix IDs before returning
matrix_id = sort_order[picked_id]
next_picked_vector_idx = [sort_order[x] for x in next_picked_vector_idx]
best_picked_vector_idx = [matrix_id] + next_picked_vector_idx
return best_score, best_picked_vector_idx