embedding, yahoo-api, vespa, vector-database, semantic-search

How to run nearest neighbor search in Vespa?


I am trying to fetch the closest neighbor for a given embedding, using the query below:

vespa query -v 'yql=select text from VectorSearch3_content where {targetHits:10}nearestNeighbor(embedding,q)' 'hits=1' 'ranking=closeness' 'input.query(q)=$Q'

This fails with the error quoted below.

Do I need to define closeness somewhere? If so, how and where do I do that via pyvespa?

I tried fetching the nearest neighbor record but got this server error: "message": "No profile named 'closeness' exists in schemas [VectorSearch3]"


Solution

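  • The error means that no rank profile named "closeness" is defined in the schema: 'ranking=closeness' in the query selects a rank profile by name, it does not refer to the closeness rank feature directly. https://docs.vespa.ai/en/nearest-neighbor-search-guide.html#schema has an example of rank profile use.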

    For a simple pyvespa example, take a look at https://pyvespa.readthedocs.io/en/latest/examples/pyvespa-examples.html, where this snippet adds a rank profile to the schema:

    app_package.schema.add_rank_profile(
        RankProfile(
            name = "max_distance",
            inputs = [("query(qpoint)", "tensor<float>(d[3])")],
            first_phase = "euclidean_distance(attribute(point), query(qpoint), d)"
        )
    )
    

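    In your case, something like the following should work. I don't know your tensor type, so replace d[3] with your own. In closeness(field, embedding), "field" is a literal keyword and "embedding" must be the name of your tensor field: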

    app_package.schema.add_rank_profile(
        RankProfile(
            name = "myrankprofile",
            inputs = [("query(q)", "tensor<float>(d[3])")],
            first_phase = "closeness(field, embedding)"
        )
    )
    

    To avoid confusion, give the rank profile a name other than closeness and use 'ranking=myrankprofile' in the query. Hope this helps!
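
To tie the query side back to the rank profile, here is a minimal sketch of the request parameters once the profile exists. build_nn_query is a hypothetical helper (not part of pyvespa), the source name and the "text" field are taken from the question, and the three-element vector is a placeholder that assumes the query tensor type is tensor<float>(d[3]):

```python
import json


def build_nn_query(embedding, rank_profile="myrankprofile",
                   source="VectorSearch3_content", target_hits=10, hits=1):
    """Hypothetical helper: build the parameters for a nearest neighbor query.

    rank_profile must match the name given to RankProfile(...) in the schema;
    the length of `embedding` must match the declared query tensor type.
    """
    yql = (
        f"select text from {source} "
        f"where {{targetHits:{target_hits}}}nearestNeighbor(embedding, q)"
    )
    return {
        "yql": yql,
        "hits": hits,
        "ranking": rank_profile,              # name of the rank profile, not a rank feature
        "input.query(q)": list(embedding),    # value for query(q) declared in the profile inputs
    }


# Placeholder vector; replace with your real query embedding.
body = build_nn_query([0.1, 0.2, 0.3])
print(json.dumps(body, indent=2))
```

With pyvespa, a dictionary like this can, as far as I know, be passed as app.query(body=body) after the application has been redeployed with the new rank profile.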