algorithmdata-structuresknn

Reverse k-nearest Neighbors - I have no idea on what to do


I have to program the method reverse_k_nearest_neighbors(k, q, A) that calculates the reverse of 𝑘 nearest neighbors in 𝐴 for the point 𝑞 ∊ 𝐴. The method must return an ordered list, where the order is given by the indices of the neighbors in 𝐴.

Thanks to a user here in Stack Overflow, I managed to code the algorithms for k_nearest_neighbors(k, q, A) and all_k_nearest_neighbors(k, A).

import math 

def euclidian_distance(a, b):
    # Calcula la distancia euclidiana entre dos puntos a y b en el espacio bidimensional.
    return math.sqrt((a[0] - b[0])**2 + (a[1] - b[1])**2)

def k_nearest_neighbors(k, q, A):
    """Calcula los k vecinos más cercanos de q en A."""
    # Punto de referencia para el cálculo de distancias.
    ref = A[q]
    distance = []

    # Calcula las distancias entre el punto de referencia y los demás puntos en A.
    for i in range(len(A)):
        if i != q:
            distance.append((euclidian_distance(A[i], ref), i))

    # Ordena las distancias y obtiene los índices de los k vecinos más cercanos.
    distance.sort(key=lambda x: x[0])
    indices = [i for dist, i in distance[:k]]

    # Implementa la ordenación de los índices sin usar sorted().
    for i in range(len(indices) - 1):
        min_index = i
        for j in range(i + 1, len(indices)):
            if indices[j] < indices[min_index]:
                min_index = j
        indices[i], indices[min_index] = indices[min_index], indices[i]

    return indices

# Test
A = [[0.0,0.0], [1.0,1.0], [4.0,1.0], [0.0,3.0], [1.0,3.0]]
assert k_nearest_neighbors(2,0,A) == [1,3]
assert k_nearest_neighbors(2,1,A) == [0,4]
assert k_nearest_neighbors(2,2,A) == [1,4]
assert k_nearest_neighbors(2,3,A) == [1,4]
assert k_nearest_neighbors(2,4,A) == [1,3]

def all_k_nearest_neighbors(k, A):
    """Calcula los k vecinos más cercanos de A."""
    results = []
    
    for i in range(len(A)):
        results.append(k_nearest_neighbors(k, i, A))
    
    return results

# Test
A = [[0.0, 0.0], [1.0, 1.0], [4.0, 1.0], [0.0, 3.0], [1.0, 3.0]]
assert all_k_nearest_neighbors(2, A) == [[1, 3], [0, 4], [1, 4], [1, 4], [1, 3]]

The code passes all tests. By the way, if it seems a convulated way of doing things, it's because I can't use most Python methods, only list, sort and append.

However, as the title says, I have to program a method reverse_k_nearest_neighbors(k, q, A) and I have no clue how to do it. None of the codes I try with pass the tests:

# Test
A = [[0.0,0.0], [1.0,1.0], [4.0,1.0], [0.0,3.0], [1.0,3.0]]
assert reverse_k_nearest_neighbors(2,0,A) == [1]
assert reverse_k_nearest_neighbors(2,1,A) == [0,2,3,4]
assert reverse_k_nearest_neighbors(2,2,A) == []
assert reverse_k_nearest_neighbors(2,3,A) == [0,4]
assert reverse_k_nearest_neighbors(2,4,A) == [1,2,3]

Solution

  • A naive implementation will call k_nearest_neighbors on every index of the input list, and check for each returned list whether q is in that list. If it is, it belongs to the reverse 𝑘-nearest neighbors.

    So:

    def reverse_k_nearest_neighbors(k, q, A):
        # naive implementation: get knn of all points in A and see which have q
        return [
            i for i in range(len(A))
                if q in k_nearest_neighbors(k, i, A)
        ]
    

    There are ways to improve the efficiency of this algorithm, but I leave it for the reader to investigate further. See for instance, Improving Reverse k Nearest Neighbors Queries - Lixin Ye