azure-cognitive-search azure-search-.net-sdk

How to filter an array in Azure Search


I have the following data in my index:

{
    "name" : "The 100",
    "lists" : [
                    "2c8540ee-85df-4f1a-b35f-00124e1d3c4a;Bellamy",
                    "2c8540ee-85df-4f1a-b35f-00155c40f11c;Pike",
                    "2c8540ee-85df-4f1a-b35f-00155c02e581;Clark"
              ]
    
}

I need to get all the documents where lists contains Pike.

A filter on the full string works with any, but I couldn't get a contains-style match to work.

$filter=lists/any(t: t eq '2c8540ee-85df-4f1a-b35f-00155c40f11c;Pike')

However, I am not sure how to search using only Pike.

$filter=lists/any(t: t eq 'Pike')

I guess eq compares against the full text of each element. Is there any way to make this query work with the given data structure?

Currently the lists field is only filterable, not searchable.


Solution

  • The eq operator looks for exact, case-sensitive matches. That's why it doesn't match 'Pike'. You need to structure your index such that terms like 'Pike' can be easily found. You can accomplish this in one of two ways:

    1. Separate the GUIDs from the names when you index documents. So instead of indexing "2c8540ee-85df-4f1a-b35f-00155c40f11c;Pike" as a single string, you could index them as separate strings in the same array, or perhaps in two different collection fields (one for GUIDs and one for names) if you need to correlate them by position.
    2. If the field is searchable, you can use the new search.ismatch function in your filter. Assuming the field is using the standard analyzer, full-text search will word-break on the semicolons, so you should be able to search just for "Pike" and get a match. The syntax would look like this:

       $filter=search.ismatch('Pike', 'lists')

       (If looking for "Pike" is all your filter does, you can just use the search and searchFields parameters to the Search API instead of $filter.) If the "lists" field is not already searchable, you will need to either add a new field and re-index the "lists" values, or re-create your index from scratch with the new field definition.
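As a rough illustration of both options, here is a minimal Python sketch that uses only the standard library. It splits each "GUID;name" string into two parallel collections before indexing (option 1) and builds the corresponding filter strings. The field names listIds and listNames and all helper names are hypothetical and do not come from the original index; the code only prepares documents and filter text and does not call the service.

```python
def split_list_entries(doc):
    """Return a copy of doc with 'lists' split into two parallel collections.

    'listIds' and 'listNames' are hypothetical field names chosen for this sketch.
    """
    ids, names = [], []
    for entry in doc["lists"]:
        # Split on the first semicolon only: "GUID;name" -> ("GUID", "name")
        guid, _, name = entry.partition(";")
        ids.append(guid)
        names.append(name)
    out = {k: v for k, v in doc.items() if k != "lists"}
    out["listIds"] = ids
    out["listNames"] = names
    return out


def odata_quote(value):
    """Quote a string literal for an OData filter (single quotes are doubled)."""
    return "'" + value.replace("'", "''") + "'"


def exact_name_filter(name):
    """Option 1: exact match against the hypothetical 'listNames' collection."""
    return f"listNames/any(n: n eq {odata_quote(name)})"


def ismatch_filter(term, field="lists"):
    """Option 2: full-text match; requires the field to be searchable."""
    return f"search.ismatch({odata_quote(term)}, {odata_quote(field)})"


doc = {
    "name": "The 100",
    "lists": [
        "2c8540ee-85df-4f1a-b35f-00124e1d3c4a;Bellamy",
        "2c8540ee-85df-4f1a-b35f-00155c40f11c;Pike",
        "2c8540ee-85df-4f1a-b35f-00155c02e581;Clark",
    ],
}
prepared = split_list_entries(doc)
```

Keeping the two collections in the same order preserves the correlation between a GUID and its name by position, which is the trade-off mentioned in option 1.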

    Update

    There is a new approach to solve this type of problem that's available in API versions 2019-05-06 and above. You can now use complex types to represent structured data, including in collections. For the original example, you could structure the data like this:

    {
    "name" : "The 100",
    "lists" : [
                    { "id": "2c8540ee-85df-4f1a-b35f-00124e1d3c4a", "name": "Bellamy" },
                    { "id": "2c8540ee-85df-4f1a-b35f-00155c40f11c", "name": "Pike" },
                    { "id": "2c8540ee-85df-4f1a-b35f-00155c02e581", "name": "Clark" }
              ]
    }
    

    And then directly query for the name sub-field like this:

    $filter=lists/any(l: l/name eq 'Pike')
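
If the existing documents still hold "GUID;name" strings, a small conversion step could reshape them into the complex-type layout before re-indexing. The following is a sketch in plain Python under that assumption; the helper names are hypothetical, and the index itself would still need to be defined with a complex collection field.

```python
def to_complex_lists(doc):
    """Return a copy of doc whose 'lists' entries are {"id", "name"} objects."""
    items = []
    for entry in doc["lists"]:
        # Split on the first semicolon only: "GUID;name" -> ("GUID", "name")
        guid, _, name = entry.partition(";")
        items.append({"id": guid, "name": name})
    return {**doc, "lists": items}


def complex_name_filter(name):
    """Build the filter on the name sub-field, doubling any single quotes."""
    escaped = name.replace("'", "''")
    return f"lists/any(l: l/name eq '{escaped}')"


old_doc = {
    "name": "The 100",
    "lists": ["2c8540ee-85df-4f1a-b35f-00155c40f11c;Pike"],
}
new_doc = to_complex_lists(old_doc)
```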
    

    See the Azure Cognitive Search documentation on complex types for details.