[SOLVED] Heuristic to choose five column arrays that maximise the dot product

First of all, thanks for a good question. I don't get to practice numpy that often. Also, I don't have much experience in posting to SE, so any feedback, code critique, and opinions relating to the answer are welcome.

This was an attempt at finding an optimal solution at first, but I didn't manage to deal with the complexity. The algorithm should, however, give you a greedy solution that might prove to be adequate.

Colab Notebook (Python code + Octave validation)

Core Idea

Note: During runtime, I've transposed the matrix. So, the column vectors in the question correspond to row vectors in the algorithm.

Notice that you can multiply the target with one vector at a time, effectively getting a new target, but with some 0s in it. These will never change, so you can filter out some computations by removing those rows (columns, in the algorithm) in further computations entirely - both from the target and the matrix. - you're then left with a valid target again (only 1s and -1 in it).

That's the basic idea of the algorithm. Given:

n: number of vectors you need to pick
b: number of best vectors to check
m: complexity of matrix operations to check one vector

Do an exponentially-complex O((n*m)^b) depth-first search, but decrease the complexity of the calculations in deeper layers by reducing target/matrix size, while cutting down a few search paths with some heuristics.

Heuristics used

The best score achieved so far is known in every recursion step. Compute an optimistic vector (turn -1 to 0) and check what scores can still be achieved. Do not search in levels where the score cannot be surpassed.

This is useless if the best vectors in the matrix have 1s and 0s equally distributed. The optimistic scores are just too high. However, it gets better with more sparsity.
Ignore duplicates. Basically, do not check duplicate vectors in the same layer. Because we reduce the matrix size, the chance for ending up with duplicates increases in deeper recursion levels.

Further Thoughts on Heuristics The most valuable ones are those that eliminate the vector choices at the start. There's probably a way to find vectors that are worse-or-equal than others, with respect to their affects on the target. Say, if v1 only differs from v2 by an extra 1, and target has a -1 in that row, then v1 is worse-or-equal than v2.

The problem is that we need to find more than 1 vector, and can't readily discard the rest. If we have 10 vectors, each worse-or-equal than the one before, we still have to keep 5 at the start (in case they're still the best option), then 4 in the next recursion level, 3 in the following, etc.

Maybe it's possible to produce a tree and pass it on in into recursion? Still, that doesn't help trim down the search space at the start... Maybe it would help to only consider 1 or 2 of the vectors in the worse-or-equal chain? That would explore more diverse solutions, but doesn't guarantee that it's more optimal.

Warning: Note that the MATRIX and TARGET in the example are in int8. If you use these for the dot product, it will overflow. Though I think all operations in the algorithm are creating new variables, so are not affected.

Code

# Given:
TARGET = np.random.choice([1, -1], size=60000).astype(np.int8)
MATRIX = np.random.randint(0, 2, size=(10000,60000), dtype=np.int8)

# Tunable - increase to search more vectors, at the cost of time.
# Performs better if the best vectors in the matrix are sparse
MAX_BRANCHES = 3   # can give more for sparser matrices

# Usage
score, picked_vectors_idx = pick_vectors(TARGET, MATRIX, 5)

# Function
def pick_vectors(init_target, init_matrix, vectors_left_to_pick: int, best_prev_result=float("-inf")):
    assert vectors_left_to_pick >= 1
    if init_target.shape == (0, ) or len(init_matrix.shape) <= 1 or init_matrix.shape[0] == 0 or init_matrix.shape[1] == 0:
        return float("inf"), None
    target = init_target.copy()
    matrix = init_matrix.copy()

    neg_matrix = np.multiply(target, matrix)
    neg_matrix_sum = neg_matrix.sum(axis=1)

    if vectors_left_to_pick == 1:
        picked_id = np.argmax(neg_matrix_sum)
        score = neg_matrix[picked_id].sum()
        return score, [picked_id]

    else:
        sort_order = np.argsort(neg_matrix_sum)[::-1]
        sorted_sums = neg_matrix_sum[sort_order]
        sorted_neg_matrix = neg_matrix[sort_order]
        sorted_matrix = matrix[sort_order]

        best_score = best_prev_result
        best_picked_vector_idx = None

        # Heuristic 1 (H1) - optimistic target.
        # Set a maximum score that can still be achieved
        optimistic_target = target.copy()
        optimistic_target[target == -1] = 0
        if optimistic_target.sum() <= best_score:
            # This check can be removed - the scores are too high at this point
            return float("-inf"), None

        # Heuristic 2 (H2) - ignore duplicates
        vecs_tried = set()

        # MAIN GOAL:   for picked_id, picked_vector in enumerate(sorted_matrix):
        for picked_id, picked_vector in enumerate(sorted_matrix[:MAX_BRANCHES]):
            # H2
            picked_tuple = tuple(picked_vector)
            if picked_tuple in vecs_tried:
                continue
            else:
                vecs_tried.add(picked_tuple)

            # Discard picked vector
            new_matrix = np.delete(sorted_matrix, picked_id, axis=0)
            
            # Discard matrix and target rows where vector is 0
            ones = np.argwhere(picked_vector == 1).squeeze()
            new_matrix = new_matrix[:, ones]
            new_target = target[ones]
            if len(new_matrix.shape) <= 1 or new_matrix.shape[0] == 0:
                return float("-inf"), None

            # H1: Do not compute if best score cannot be improved
            new_optimistic_target = optimistic_target[ones]
            optimistic_matrix = np.multiply(new_matrix, new_optimistic_target)
            optimistic_sums = optimistic_matrix.sum(axis=1)
            optimistic_viable_vector_idx = optimistic_sums > best_score
            if optimistic_sums.max() <= best_score:
                continue
            new_matrix = new_matrix[optimistic_viable_vector_idx]
         
            score, next_picked_vector_idx = pick_vectors(new_target, new_matrix, vectors_left_to_pick - 1, best_prev_result=best_score)
            
            if score <= best_score:
                continue

            # Convert idx of trimmed-down matrix into sorted matrix IDs
            for i, returned_id in enumerate(next_picked_vector_idx):
                # H1: Loop until you hit the required number of 'True'
                values_passed = 0
                j = 0
                while True:
                    value_picked: bool = optimistic_viable_vector_idx[j]
                    if value_picked:
                        values_passed += 1
                        if values_passed-1 == returned_id:
                            next_picked_vector_idx[i] = j
                            break
                    j += 1

                # picked_vector index
                if returned_id >= picked_id:
                    next_picked_vector_idx[i] += 1

            best_score = score

            # Convert from sorted matrix to input matrix IDs before returning
            matrix_id = sort_order[picked_id]
            next_picked_vector_idx = [sort_order[x] for x in next_picked_vector_idx]
            best_picked_vector_idx = [matrix_id] + next_picked_vector_idx

        return best_score, best_picked_vector_idx