java, elasticsearch, elasticsearch-jest

OR and AND Operators in Elasticsearch query


I have a few JSON documents in the following format:

    _source: {
            userId: "A1A1",
            customerId: "C1",
            component: "comp_1",
            timestamp: 1408986553,
     }

I want to query the documents based on the following condition:

((userId == currentUserId) OR (customerId == currentCustomerId) OR (currentRole == ADMIN)) AND (component == currentComponent)

I tried using the SearchSourceBuilder and QueryBuilders.matchQuery, but I wasn't able to combine multiple sub-queries with AND and OR operators.

    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    searchSourceBuilder.query(QueryBuilders.matchQuery("userId", userId)).sort("timestamp", SortOrder.DESC).size(count);

How do we query Elasticsearch using OR and AND operators?


Solution

  • I think the bool query is the best fit in this case.

    Something like:

    {
        "bool" : {
            "must" : { "term" : { "component" : "comp_1" } },
            "should" : [
                { "term" : { "userId" : "A1A1" } },
                { "term" : { "customerId" : "C1" } },
                { "term" : { "currentRole" : "ADMIN" } }
            ],
            "minimum_should_match" : 1
        }
    }
    

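    In Java, this gives:

    // Assumes org.elasticsearch.index.query.QueryBuilder and QueryBuilders are imported.
    QueryBuilder qb = QueryBuilders
        .boolQuery()
        .must(QueryBuilders.termQuery("component", currentComponent))      // AND
        .should(QueryBuilders.termQuery("userId", currentUserId))          // OR
        .should(QueryBuilders.termQuery("customerId", currentCustomerId))  // OR
        .should(QueryBuilders.termQuery("currentRole", ADMIN))             // OR
        // At least one should clause must match; later client versions
        // rename this method to minimumShouldMatch.
        .minimumNumberShouldMatch(1);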
    

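    The must clauses behave as ANDs and the should clauses behave more or less as ORs, except that you can require a minimum number of should clauses to match (using minimum_should_match). When a bool query has no must or filter clause, this minimum defaults to 1. When a must clause is present, as it is here, the default is 0, meaning that a document matching no should condition would be returned as well. That is why the query above sets minimum_should_match to 1 explicitly.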
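To make the role of minimum_should_match concrete, here is a toy model of the matching rule in plain Java. It does not call Elasticsearch at all: the class BoolSketch and its helpers matches and term are hypothetical names made up for this sketch, and documents are modelled as simple string maps.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Toy model of bool-query matching. It does not use any Elasticsearch classes.
final class BoolSketch {

    /** True when every must clause matches and at least minShould should clauses match. */
    static boolean matches(Map<String, String> doc,
                           List<Predicate<Map<String, String>>> must,
                           List<Predicate<Map<String, String>>> should,
                           int minShould) {
        for (Predicate<Map<String, String>> clause : must) {
            if (!clause.test(doc)) {
                return false; // a failed must clause rejects the document
            }
        }
        long hits = should.stream().filter(c -> c.test(doc)).count();
        return hits >= minShould;
    }

    /** Hypothetical stand-in for a term query: exact equality on one field. */
    static Predicate<Map<String, String>> term(String field, String value) {
        return doc -> value.equals(doc.get(field));
    }

    public static void main(String[] args) {
        // A document of the right component that belongs to another user and customer.
        Map<String, String> doc =
            Map.of("userId", "B2B2", "customerId", "C9", "component", "comp_1");
        List<Predicate<Map<String, String>>> must =
            List.of(term("component", "comp_1"));
        List<Predicate<Map<String, String>>> should =
            List.of(term("userId", "A1A1"), term("customerId", "C1"));

        System.out.println(matches(doc, must, should, 1)); // false
        System.out.println(matches(doc, must, should, 0)); // true
    }
}
```

With a minimum of 0 the should clauses no longer filter anything, so the unrelated document slips through on the must clause alone.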

    If you want to do more complex queries involving nested ANDs and ORs, simply nest other bool queries inside must or should parts.
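As a sketch of that nesting, the snippet below builds the JSON body for "component AND (userId OR customerId)" by placing an inner bool query inside the must list of an outer one. It uses only string concatenation, so it has no Elasticsearch dependency. The class NestedBoolJson and the helpers quote, term, anyOf and allOf are hypothetical names for this example. The currentRole clause is left out on the assumption that the role describes the caller rather than a field stored in the document (the sample document has no such field); if it is stored, it can be added as one more term.

```java
import java.util.List;

// Builds a nested bool query as a JSON string. No Elasticsearch classes are used.
final class NestedBoolJson {

    /** Minimal JSON string quoting: escapes backslashes and double quotes only. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** A term clause: {"term":{"field":"value"}} */
    static String term(String field, String value) {
        return "{\"term\":{" + quote(field) + ":" + quote(value) + "}}";
    }

    /** OR group: a bool query whose should clauses need at least one match. */
    static String anyOf(List<String> clauses) {
        return "{\"bool\":{\"should\":[" + String.join(",", clauses)
            + "],\"minimum_should_match\":1}}";
    }

    /** AND group: a bool query whose must clauses all have to match. */
    static String allOf(List<String> clauses) {
        return "{\"bool\":{\"must\":[" + String.join(",", clauses) + "]}}";
    }

    public static void main(String[] args) {
        String query = allOf(List.of(
            term("component", "comp_1"),
            anyOf(List.of(term("userId", "A1A1"), term("customerId", "C1")))));
        System.out.println(query);
    }
}
```

In a search request body this string would sit under the top-level "query" key. Because the OR group is itself a must clause here, it cannot be skipped even if minimum_should_match were forgotten on the outer query.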

    Also, since you're looking for exact values (ids and so on), you may want to use term queries instead of match queries, which spares you the analysis phase (if those fields are analyzed at all, which doesn't necessarily make sense for ids). If they are analyzed, you can still do that, but only if you know exactly how your terms are stored (the standard analyzer stores them lowercased, for instance).
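The lowercasing pitfall can be illustrated with a rough simulation in plain Java. This is only an approximation under stated assumptions: the hypothetical analyze method splits on whitespace and lowercases, which is far simpler than the real standard analyzer, and termMatches and matchMatches are made-up helpers that mimic how a term query skips analysis while a match query analyzes its input first.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Rough simulation of analyzed fields. Not the real Elasticsearch analysis chain.
final class AnalysisSketch {

    /** Simplified analysis: split on whitespace, then lowercase each token. */
    static List<String> analyze(String text) {
        return Arrays.stream(text.trim().split("\\s+"))
            .filter(t -> !t.isEmpty())
            .map(t -> t.toLowerCase(Locale.ROOT))
            .collect(Collectors.toList());
    }

    /** Term-style lookup: the query value is compared to stored tokens as-is. */
    static boolean termMatches(List<String> storedTokens, String queryTerm) {
        return storedTokens.contains(queryTerm);
    }

    /** Match-style lookup: the query text is analyzed first, then any token may match. */
    static boolean matchMatches(List<String> storedTokens, String queryText) {
        return analyze(queryText).stream().anyMatch(storedTokens::contains);
    }

    public static void main(String[] args) {
        List<String> stored = analyze("A1A1"); // what an analyzed field would hold

        System.out.println(stored);                       // [a1a1]
        System.out.println(termMatches(stored, "A1A1"));  // false
        System.out.println(termMatches(stored, "a1a1"));  // true
        System.out.println(matchMatches(stored, "A1A1")); // true
    }
}
```

If the id fields are mapped so that they are not analyzed, the stored value stays "A1A1" and the term query can use the original casing directly.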