Tags: ruby-on-rails, ruby, elasticsearch, tire

Using facets with missing filters in ElasticSearch and Tire


I am trying to use facets with a search query that includes a missing filter, but the facet counts do not seem to take into account the records that the missing filter removes.

In the Rails app:

@items = Item.search(per_page: 100, page: params[:page], load: true) do |search|
  search.query do |query|
    query.boolean do |boolean|
      boolean.must { |must| must.string params[:q], default_operator: "AND" }
      boolean.must { |must| must.term :project_id, @project.id }
    end
    query.filtered do
      filter :missing , :field =>  :user_id
    end
  end
  search.facet('tags') do
    terms :tag
  end
end

generates the request:

curl -X GET 'http://localhost:9200/items/user_story/item?load=true&size=100&pretty' -d '{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "*",
            "default_operator": "AND"
          }
        },
        {
          "term": {
            "project_id": {
              "term": 132
            }
          }
        }
      ]
    },
    "filtered": {
      "filter": {
        "and": [
          {
            "missing": {
              "field": "user_id"
            }
          }
        ]
      }
    }
  },
  "facets": {
    "tags": {
      "terms": {
        "field": "tag",
        "size": 10,
        "all_terms": false
      }
    }
  },
  "size": 100
}'

which returns nil for facets.

If I move the missing filter out to search.filter:

@items = Item.search(per_page: 100, page: params[:page], load: true) do |search|
  search.query do |query|
    query.boolean do |boolean|
      boolean.must { |must| must.string params[:q], default_operator: "AND" }
      boolean.must { |must| must.term :project_id, @project.id }
    end
  end
  search.filter(:missing, :field => 'user_id' )
  search.facet('tags') do
    terms :tag
  end
end

...it makes the request:

curl -X GET 'http://localhost:9200/user_stories/user_story/_search?load=true&size=100&pretty' -d '{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "*",
            "default_operator": "AND"
          }
        },
        {
          "term": {
            "project_id": {
              "term": 132
            }
          }
        }
      ]
    }
  },
  "facets": {
    "tags": {
      "terms": {
        "field": "tag",
        "size": 10,
        "all_terms": false
      }
    }
  },
  "filter": {
    "missing": {
      "field": "user_id"
    }
  },
  "size": 100
}'

which does return the facets, but their counts still include the records that the filter removes from the results.

Is there another way I should be writing my query so I can get facets that also take into account the missing filter?


Solution

  • If I understand correctly, you're hitting the problem where facets are computed from the results as limited by the query, not by the top-level filter element. Also, the first query doesn't look right to me: in the generated request, filtered sits next to bool inside query instead of wrapping it.

    I'd rewrite the request like this:

    require 'tire'
    
    params = {:q => 'foo'}
    
    s = Tire.search do |search|
      search.query do |search_query|
        search_query.filtered do |f|
    
          # Queries
          f.query do |q|
            q.string params[:q], default_operator: "AND"
          end
    
          # Filters
          f.filter :missing , :field => :user_id
          f.filter :term,     :project_id => '123'
        end
      end
      search.facet('tags') do
        terms :tag
      end
    end
    
    puts s.to_curl
    

    Basically, we use the filtered query as the main query, get rid of the boolean query, and use filters (missing and term) for maximum efficiency. Because the filters are now part of the query, the facets are computed only from the documents that pass them.

    Of course, if you need to combine more queries, you could use q.boolean instead of q.string.

    Lastly, the string query (i.e. the Lucene query string query) is usually a suboptimal choice for regular searches, since it exposes the whole "special Lucene syntax" to the user and is error-prone. The match query is usually a much better choice.
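
To make the two points above concrete, here is a minimal sketch of the raw request body that such a search would send, written as a plain Ruby hash rather than with the Tire DSL. It is only an illustration under a few assumptions: build_search_body is a made-up helper, 'title' is a hypothetical field name (the question does not say which field the user input should be matched against), and the exact JSON that Tire generates may differ in detail. It puts the missing and term filters inside the filtered query, so the tags facet should only count documents that pass them, and it uses a match query in place of the query string query.

```ruby
require 'json'

# Hypothetical helper: builds the search request body as a plain Ruby hash.
# `field` defaults to 'title', which is an assumed field name for illustration.
def build_search_body(user_input, project_id, field: 'title')
  {
    query: {
      # The filtered query is the main query, so facets see its results.
      filtered: {
        # match query instead of query_string: no special Lucene syntax
        # is interpreted in the user's input.
        query: {
          match: {
            field => { query: user_input, operator: 'and' }
          }
        },
        # Both filters are combined with an "and" filter.
        filter: {
          and: [
            { missing: { field: 'user_id' } },
            { term: { project_id: project_id } }
          ]
        }
      }
    },
    facets: {
      tags: { terms: { field: 'tag' } }
    },
    size: 100
  }
end

body = build_search_body('foo', 132)
puts JSON.pretty_generate(body)
```

Note that there is no top-level filter key in this body. A top-level filter narrows the returned hits only, which is why the facets in the second request of the question ignored it.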