javascript elasticsearch morelikethis

Why is my Elasticsearch query returning zero documents?


I am trying to query an AWS Elasticsearch domain from a Lambda worker.

To do so, I am using http-aws-es and the main JavaScript client for Elasticsearch.

I query documents with the following relevant fields:

What I want to achieve is:

  1. Filter out all documents whose status is neither PUBLISHED nor VERIFIED, or whose ref field is set
  2. Return the best matches for my keywords argument (a string array) against the values in field and thematics
  3. Sort so that documents with PUBLISHED status come first
  4. Limit the number of results to 20

I found the more_like_this query and gave it a try. I built my query step by step, and the current version at least doesn't return an error, but it returns no documents either. It still lacks the ref filter as well as #3 and #4 from above. Here is the query:

    const client = new elasticsearch.Client({
      host: ELASTICSEARCH_DOMAIN,
      connectionClass: httpAwsEs,
      amazonES: {
        region: AWS_REGION,
        credentials: new AWS.EnvironmentCredentials('AWS')
      }
    })
    let keywords = event.arguments.keywords
    let rst = await client.search({
      body: {
        'query': {
          'bool': {
            'filter': {
              'bool': {
                'must_not': [
                  {
                    'term': {
                      'status': 'REMOVED'
                    }
                  },
                  {
                    'term': {
                      'status': 'PENDING'
                    }
                  },
                  {
                    'term': {
                      'status': 'BLOCKED'
                    }
                  }
                ]
              }
            },
            'must': {
              'more_like_this': {
                'fields': ['field', 'thematics'],
                'like': keywords,
                'min_term_freq': 1,
                'max_query_terms': 2
              },
              'should': [
                {
                  'term': {
                    'status': 'PUBLISHED'
                  }
                }
              ]
            }
          }
        }
      }

    })
    console.log(rst)
    return rst

I have to upload my Lambda code every time I want to test this, which complicates debugging a lot. Since I have never written ES queries before, I would appreciate at least some hints on how to proceed, or on whether I am misusing the ES query syntax.


EDIT:

As requested, here is my index mapping (with JS type):

Taken from the AWS Elasticsearch management console (index tab > mappings).


Solution

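  • There are a couple of issues in your query: the should clause is nested inside must, next to more_like_this, where it does not belong, and the must_not clauses are wrapped in an extra bool inside filter, which is unnecessary. The term clauses in the query below also target status.keyword rather than status, on the assumption that status is a text field with a keyword sub-field (the default dynamic mapping); a term query against an analyzed text field would not match uppercase values such as PUBLISHED. Try the simplified query below instead: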

    {
      'query': {
        'bool': {
          'must_not': [
            {
              'term': {
                'status.keyword': 'REMOVED'
              }
            },
            {
              'term': {
                'status.keyword': 'PENDING'
              }
            },
            {
              'term': {
                'status.keyword': 'BLOCKED'
              }
            }
          ],
          'must': [
            {
              'more_like_this': {
                'fields': [
                  'field',
                  'thematics'
                ],
                'like': keywords,
                'min_term_freq': 1,
                'max_query_terms': 2
              }
            }
          ],
          'should': [
            {
              'term': {
                'status.keyword': 'PUBLISHED'
              }
            }
          ]
        }
      }
    }
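The query above still leaves out the ref filter, the sort, and the result limit from the question. A minimal sketch of how the remaining points could be added is shown here; it is not a tested drop-in. `buildSearchBody` is a hypothetical helper name, and the sketch assumes that `status` has a `keyword` sub-field and that `ref` is simply absent on documents where it is not set. Sorting on `status.keyword` in ascending order puts PUBLISHED first only because the filter leaves just PUBLISHED and VERIFIED, and "PUBLISHED" sorts before "VERIFIED" alphabetically.

```javascript
// Hypothetical helper: builds a search body covering the four requirements.
// Assumptions: `status.keyword` exists in the mapping, and `ref` is missing
// (not merely empty) on documents where it is not set.
function buildSearchBody (keywords) {
  return {
    // Requirement 4: return at most 20 hits
    size: 20,
    query: {
      bool: {
        // Requirement 1a: keep only PUBLISHED or VERIFIED documents
        // (filter clauses do not affect the relevance score)
        filter: [
          { terms: { 'status.keyword': ['PUBLISHED', 'VERIFIED'] } }
        ],
        // Requirement 1b: drop documents where `ref` is set
        must_not: [
          { exists: { field: 'ref' } }
        ],
        // Requirement 2: score documents by similarity to the keywords
        must: [
          {
            more_like_this: {
              fields: ['field', 'thematics'],
              like: keywords,
              min_term_freq: 1,
              // min_doc_freq defaults to 5, which can discard every term on
              // a small index; lowering it is an assumption about the data
              min_doc_freq: 1,
              max_query_terms: 2
            }
          }
        ]
      }
    },
    // Requirement 3: PUBLISHED before VERIFIED, then best matches first
    sort: [
      { 'status.keyword': 'asc' },
      { _score: 'desc' }
    ]
  }
}
```

It would be used as `client.search({ body: buildSearchBody(event.arguments.keywords) })`. Because the function only builds a plain object, it can be inspected or unit tested locally without deploying the Lambda.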