azure, azure-cognitive-search

Azure AI Search - Deleting documents through the REST API


I delete documents through the Azure AI Search REST API. The storage used by the data decreases, but the vector quota still stays at 100%.

How can we free up this space? We need to delete the vector copies of the documents that were deleted.

Soft delete only works for blobs and files. How can we track document deletions so that the space is freed?


Solution

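  • The only reliable way to actually free up vector space is to rebuild the index. The general approach is: save the index schema, export the documents you want to keep, delete the index, recreate it from the saved schema, and re-upload the documents.

    You can automate this cleanup with an Azure Function (Python or C#) that runs on a schedule (e.g., once a week) so the index does not hit the vector quota ceiling. A Python example with a timer trigger: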

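    import logging
    import os

    import azure.functions as func
    from azure.core.credentials import AzureKeyCredential
    from azure.search.documents import SearchClient
    from azure.search.documents.indexes import SearchIndexClient
    from azure.search.documents.indexes.models import SearchIndex

    BATCH_SIZE = 1000  # Azure AI Search accepts at most 1,000 documents per batch

    def main(mytimer: func.TimerRequest) -> None:
        search_service = os.getenv("AZURE_SEARCH_SERVICE")
        admin_key = os.getenv("AZURE_SEARCH_ADMIN_KEY")
        index_name = os.getenv("AZURE_SEARCH_INDEX_NAME")

        endpoint = f"https://{search_service}.search.windows.net"
        credential = AzureKeyCredential(admin_key)

        index_client = SearchIndexClient(endpoint=endpoint, credential=credential)
        search_client = SearchClient(endpoint=endpoint, index_name=index_name, credential=credential)

        # Save the schema so the index can be recreated unchanged.
        schema = index_client.get_index(index_name).serialize()

        # Export the remaining documents. Only retrievable fields are returned, so
        # vector fields must be retrievable (or be re-embedded) to survive the rebuild.
        # Keys starting with "@search." are result metadata, not index fields.
        documents = [
            {key: value for key, value in doc.items() if not key.startswith("@search.")}
            for doc in search_client.search(search_text="*")
        ]
        logging.info("Exported %d documents from %s", len(documents), index_name)

        # Deleting the index destroys its data: make sure the export above is
        # complete (or that the source data is still available) before this point.
        index_client.delete_index(index_name)
        index_client.create_index(SearchIndex.deserialize(schema))

        # Re-upload in batches to stay within the per-request document limit.
        for start in range(0, len(documents), BATCH_SIZE):
            search_client.upload_documents(documents=documents[start:start + BATCH_SIZE])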
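
    Since a rebuild is disruptive, the scheduled function could first check whether the vector quota is actually close to its ceiling. The sketch below is only an illustration: it assumes the statistics payload has the shape of the Get Service Statistics REST response (GET {endpoint}/servicestats), with usage and quota under counters.vectorIndexSize. The function names, the 0.8 threshold, and the sample numbers are hypothetical.

```python
from typing import Optional


def vector_quota_used(service_stats: dict) -> Optional[float]:
    """Return the used fraction of the vector quota, or None if it is unknown."""
    # Assumed payload shape: {"counters": {"vectorIndexSize": {"usage": ..., "quota": ...}}}
    counter = service_stats.get("counters", {}).get("vectorIndexSize") or {}
    usage, quota = counter.get("usage"), counter.get("quota")
    if usage is None or not quota:
        return None
    return usage / quota


def should_rebuild(service_stats: dict, threshold: float = 0.8) -> bool:
    """True when the vector quota usage is at or above the (hypothetical) threshold."""
    used = vector_quota_used(service_stats)
    return used is not None and used >= threshold


# Illustrative payload only; the numbers are made up.
sample_stats = {"counters": {"vectorIndexSize": {"usage": 950, "quota": 1000}}}
print(should_rebuild(sample_stats))  # True
```

    In the timer function, a check like this could run before the delete step so that the index is only rebuilt when it is needed.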
    

    Suggestion: Add a field such as isDeleted or status to your data model so you can easily filter out deleted records when rebuilding. Also consider index versioning (e.g., myindex_v1, myindex_v2) in production: build the new index alongside the old one and switch over once it is populated, which avoids downtime and makes reindexing easier to manage.
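
    Both suggestions can be sketched as two small helpers. This is a minimal illustration, not part of the Azure SDK: the function names are hypothetical, and it assumes that isDeleted is a boolean field on each document and that index names follow the name_vN pattern from the example above.

```python
import re


def live_documents(documents, flag_field="isDeleted"):
    """Keep only documents that are not flagged as deleted."""
    return [doc for doc in documents if not doc.get(flag_field, False)]


def next_index_name(current):
    """Return the next versioned name, e.g. myindex_v1 -> myindex_v2."""
    match = re.fullmatch(r"(.+)_v(\d+)", current)
    if match is None:
        # No version suffix yet: start numbering at v1.
        return f"{current}_v1"
    base, version = match.group(1), int(match.group(2))
    return f"{base}_v{version + 1}"


docs = [
    {"id": "1", "isDeleted": False},
    {"id": "2", "isDeleted": True},
    {"id": "3"},  # flag missing: treated as not deleted
]
print([doc["id"] for doc in live_documents(docs)])  # ['1', '3']
print(next_index_name("myindex_v1"))                # myindex_v2
```

    During a rebuild, the filtered documents would be uploaded to the index named by next_index_name, and the application would be pointed at the new name once the upload finishes.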

    For more details, refer to the following Microsoft documentation: Vector index limits; Remove source vectors; Update or rebuild an index.