How to batch multiple vector queries to the Azure AI Search vector index API

I am running a similarity search for a given vector in Azure AI Search. The code below submits one target vector and returns the 5 closest matches.

I would like to submit a set of N target vectors and receive back N result sets, one top-5 list per vector. When I submit multiple vectors inside vectorQueries, it returns a single set of the 5 closest matches, as explained in the docs.

I have tried parallelizing the queries and submitting one request per vector, but would like to avoid sending multiple requests if it can be done with one. Is there a way to do this?

import json
import requests

# Construct the search URL
url = f"https://{service_name}.search.windows.net/indexes/{index_name}/docs/search?api-version={api_version}"

# Construct the search payload
payload = {
    "vectorQueries": [{
        "kind": "vector",
        "fields": "embedding",
        "vector": target_vector
    }],
    "top": 5,
}

# Set the headers
headers = {
    'Content-Type': 'application/json',
    'api-key': api_key
}

# Send the request
response = requests.post(url, headers=headers, data=json.dumps(payload))

I tried

payload = {
    "vectorQueries": [{
        "kind": "vector",
        "fields": "embedding",
        "vector": target_vector1
    },
    {
        "kind": "vector",
        "fields": "embedding",
        "vector": target_vector2
    }],
    "top": 5,
}

but it returns only one set of 5 matches. I am looking for a way to get one set for target_vector1 and another for target_vector2.

Solution

Sending multiple vector queries in a single request does not give you more than one result set. One request always produces one response: a single ranked list of the documents that best match the request as a whole.

You can see the sample mentioned here.

When you send a single query with two different embeddings to an index that has two vector fields, myImageVector and myTextVector, the similarity searches run in parallel and the results are then merged into one list scored using Reciprocal Rank Fusion (RRF).

The same is happening in your case.

payload = {
    "vectorQueries": [{
        "kind": "vector",
        "fields": "embedding",
        "vector": target_vector1
    },
    {
        "kind": "vector",
        "fields": "embedding",
        "vector": target_vector2
    }],
    "top": 5,
}

When you send target_vector1 and target_vector2, a similarity search runs on the embedding field for each vector in parallel, and the two result lists are merged into one list scored using Reciprocal Rank Fusion (RRF).
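To make the fusion step concrete, here is a minimal sketch of RRF in Python. It is illustrative only and not Azure's actual implementation: the function name rrf_fuse, the document IDs, and the constant k=60 (a value commonly used for RRF) are assumptions.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60, top=5):
    """Merge several ranked lists of document IDs into one list.

    Each document earns 1 / (k + rank) from every list it appears in,
    and the summed scores decide the final order. k=60 is an assumed
    constant, not a value taken from the original post.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID.
    ordered = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
    return ordered[:top]


# Hypothetical per-vector rankings for target_vector1 and target_vector2.
ranking_for_vector1 = ["doc_a", "doc_b", "doc_c"]
ranking_for_vector2 = ["doc_b", "doc_c", "doc_d"]

# The two rankings collapse into a single list, which is why one request
# with two vector queries cannot return two separate result sets.
fused = rrf_fuse([ranking_for_vector1, ranking_for_vector2], top=2)
print(fused)
```

Documents that rank well for both vectors rise to the top of the fused list, and the per-vector rankings are no longer recoverable from the response.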

If you want a separate result set for each target vector, you need to send a separate request per vector, like below.

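import json
import requests

# `name` is the search service name; `api_key` is your query or admin key
# (kept in a variable instead of being hardcoded in the request).
url = f"https://{name}.search.windows.net/indexes/vector-1729160100909/docs/search?api-version=2024-07-01"

# The headers are identical for every request, so build them once
headers = {
    'Content-Type': 'application/json',
    'api-key': api_key
}

res = []

# Send one request per target vector and collect each response
for vector in [target_vector1, target_vector2]:
    # Construct the search payload
    payload = {
        "select": "chunk,title",
        "vectorQueries": [{
            "kind": "vector",
            "fields": "text_vector",
            "vector": vector
        }],
        "top": 5,
    }

    response = requests.post(url, headers=headers, data=json.dumps(payload))
    res.append(response.json())

# Print the matches returned for each vector
for result in res:
    print(result['value'])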

Output:

[{'@search.score': 0.49568734, 'chunk': '\'Best-selling products are often overhyped, but "excellent" quality stands out 9741".\'', 'title': 'doc2.txt'}, {'@search.score': 0.49395812, 'chunk': '"Innovative" technology is a "game-changer". "Limited edition" items are popular.', 'title': 'doc3.txt'}]
[{'@search.score': 0.4964863, 'chunk': '"Innovative" technology is a "game-changer". "Limited edition" items are popular.', 'title': 'doc3.txt'}, {'@search.score': 0.49611893, 'chunk': '\'Best-selling products are often overhyped, but "excellent" quality stands out 9741".\'', 'title': 'doc2.txt'}]
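
Since each vector needs its own request, the loop can also be run concurrently to reduce total latency. The following is only a sketch under assumptions: build_payload, search_many, and make_sender are hypothetical helper names, the worker count and timeout are arbitrary, and the field name is taken from the question.

```python
from concurrent.futures import ThreadPoolExecutor


def build_payload(vector, field="embedding", top=5):
    """Build the search payload for a single target vector."""
    return {
        "vectorQueries": [{"kind": "vector", "fields": field, "vector": vector}],
        "top": top,
    }


def search_many(vectors, send, max_workers=8):
    """Send one request per vector concurrently.

    `send` is any callable that takes a payload and returns the result.
    pool.map returns results in the same order as `vectors`.
    """
    payloads = [build_payload(vector) for vector in vectors]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(send, payloads))


def make_sender(url, api_key):
    """Return a `send` callable that posts a payload to the search URL."""
    import requests  # imported here so the helpers above have no dependency

    headers = {"Content-Type": "application/json", "api-key": api_key}

    def send(payload):
        response = requests.post(url, headers=headers, json=payload, timeout=30)
        response.raise_for_status()
        return response.json()["value"]

    return send


# Hypothetical usage, assuming `url` and `api_key` are defined as above:
# results = search_many([target_vector1, target_vector2], make_sender(url, api_key))
# results[0] holds the matches for target_vector1, results[1] for target_vector2.
```

This still sends N requests, but they overlap in time instead of running one after another.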