I am new to Elasticsearch, so any tips or hints are appreciated!
I have an index from which I want to retrieve the entries whose "my_id" field exactly matches one of several values.
These are my attempts:
Query 1:
{
  "min_score": 1,
  "query": {
    "bool": {
      "must": [
        {
          "terms": {
            "project": ["one"]
          }
        }
      ],
      "filter": [
        { "terms": { "my_id": ["my_id_2", "my_id_1"] } }
      ]
    }
  }
}
-> returns empty hits.
Query 2:
{
  "min_score": 1,
  "query": {
    "bool": {
      "must": [
        {
          "terms": {
            "project": ["one"]
          }
        }
      ],
      "should": [
        {
          "match": {
            "my_id": "my_id_1"
          }
        },
        {
          "match": {
            "my_id": "my_id_2"
          }
        }
      ]
    }
  }
}
-> returns more entries than just those with the given my_id values.
I am at a loss here and do not see what could be wrong. Do I need to look at the analyzers, or at whether the field is set up as text or keyword? If so, how do I check and change these?
Thanks in advance for every answer and comment!
Edit:
This is the field type:
"my_id": {
  "type": "text",
  "fields": {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    }
  }
},
In your second code snippet:
When "must" and "should" are combined, you get more results because the "should" clauses are optional: matching "must" alone is enough for a document to be returned. For example, "must project = 'one'" fetches every document with project = "one", regardless of what is inside "should". In this situation "should" only adds to the score of the documents that match it; it does not filter anything.
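If the intent of the second query is that at least one of the "should" clauses has to match, the bool query's "minimum_should_match" parameter can express that. Below is a minimal sketch in Python that only builds and prints the request body; sending it to a cluster is left out, and build_query is a hypothetical helper name, not part of any client library.

```python
import json

def build_query(project, ids):
    """Build a bool query in which at least one should clause must match."""
    return {
        "query": {
            "bool": {
                "must": [{"terms": {"project": [project]}}],
                "should": [{"match": {"my_id": i}} for i in ids],
                # Without this line the should clauses are optional
                # whenever a must (or filter) clause is present.
                "minimum_should_match": 1,
            }
        }
    }

body = build_query("one", ["my_id_1", "my_id_2"])
print(json.dumps(body, indent=2))
```

Note that "match" still analyzes the search text, so this narrows the result set but is not a strict exact match.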
In the first code snippet:
The structure of the query is correct: it asks for all documents with (project == "one" && (my_id == "my_id_1" || my_id == "my_id_2")). If you are not getting results, check the following. First, remove "min_score": 1 and see whether hits appear. Second, make sure the index really contains documents with project = "one" and my_id = "my_id_1" or "my_id_2", because otherwise there is nothing to retrieve. Third, since your edit shows that "my_id" is mapped as "text", keep in mind that "terms" compares your values against the analyzed tokens rather than the original string; for exact matching, the "my_id.keyword" sub-field from your mapping is the safer target.
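Given the mapping shown in the question's edit, one thing worth trying is to run the first query against the "my_id.keyword" sub-field. Here is a hedged sketch in Python that only builds the request body; exact_id_query is a hypothetical helper name, and the mapping of "project" is not shown in the question, so that clause is left as it was.

```python
import json

def exact_id_query(project, ids):
    """Query 1 from the question, with the id filter on the keyword sub-field."""
    return {
        # "min_score" is deliberately left out while debugging.
        "query": {
            "bool": {
                "must": [{"terms": {"project": [project]}}],
                "filter": [
                    # keyword fields are not analyzed, so the stored value
                    # has to equal one of these strings exactly.
                    {"terms": {"my_id.keyword": ids}}
                ],
            }
        }
    }

exact_body = exact_id_query("one", ["my_id_2", "my_id_1"])
print(json.dumps(exact_body, indent=2))
```

To see how a field is mapped, the get mapping API (GET <your-index>/_mapping, where <your-index> is a placeholder) returns the definition that the question's edit already quotes.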
Furthermore, here are a few more things that you may not be aware of:
Filter and must select the same documents; the only difference is that filter clauses do not contribute to the score of the results.
Should does not restrict the results when the same bool query also has must or filter clauses; it only adds to the score of the documents that match it. If you remove should in that case, you get the same documents, only with lower scores. (In a bool query that has nothing but should clauses, at least one of them has to match.)
Term and match differ in how they treat the search text: term looks for the exact value as it is stored in the index ("data" == "data"), while match runs the search text through the field's analyzer first, which means you can search for part of a sentence ("data" matches "data in the database").
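The difference between term and match can be sketched with a toy model in Python. The analyze function below is an assumption for illustration only (lowercase, then split on whitespace); the real standard analyzer in Elasticsearch does considerably more.

```python
def analyze(text):
    """Toy analyzer (an assumption): lowercase, then split on whitespace."""
    return text.lower().split()

def term_matches(indexed_tokens, value):
    # term: the search value is NOT analyzed; it must equal an indexed token.
    return value in indexed_tokens

def match_matches(indexed_tokens, query_text):
    # match: the search text IS analyzed; with the default OR operator,
    # one matching token is enough.
    return any(tok in indexed_tokens for tok in analyze(query_text))

# What a text field would store for this sentence in the toy model.
tokens = analyze("Data in the database")  # ['data', 'in', 'the', 'database']

print(term_matches(tokens, "data"))                  # True
print(term_matches(tokens, "Data in the database"))  # False: no such single token
print(match_matches(tokens, "DATA"))                 # True: analyzed to 'data'
```

This is also why a term-level query against a text field can come back empty even though the document looks like an obvious match.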
(AND), (OR) and (Relationship) logic in the query:
I will try to explain this part by putting code and the equivalent query side by side:
First example:
code:
if(a == 1 && b == 2){ getdata(); }
query:
{
  "query": {
    "bool": {
      "must": [
        { "term": { "a": 1 } },
        { "term": { "b": 2 } }
      ]
    }
  }
}
meaning that the relation between {term},{term} inside must is (AND)
Second example:
code:
if(a == 1 || a == 2){ getdata(); }
query:
{
  "query": {
    "bool": {
      "must": [
        {
          "terms": { "a": [1, 2] }
        }
      ]
    }
  }
}
meaning that the relation between the values in {terms[1,2]} is (OR)
Third example:
code:
Doc dataArray1[];
Doc dataArray2[];
if(a==1) {
dataArray1 = getdata();
}
if(b==2) {
dataArray2 = getdata();
}
intersection(dataArray1, dataArray2);
query:
{
  "query": {
    "bool": {
      "must": [
        {
          "term": { "a": 1 }
        }
      ],
      "filter": [
        {
          "term": { "b": 2 }
        }
      ]
    }
  }
}
meaning that the different clause types of a bool query (must, filter) are combined as an intersection; should is excluded from this.
Fourth example:
code:
Doc dataArray1[];
Doc dataArray2[];
Doc dataArray3[];
Doc res[];
if(a==1) {
dataArray1 = getdata();
}
if(b==2) {
dataArray2 = getdata();
}
if(c==3) {
dataArray3 = getdata();
}
res = intersection(dataArray1, dataArray2);
for(Doc d : res) {
if(contains(dataArray3, d)) { d.score++; }
}
query:
{
  "query": {
    "bool": {
      "must": [
        {
          "term": { "a": 1 }
        }
      ],
      "filter": [
        {
          "term": { "b": 2 }
        }
      ],
      "should": [
        {
          "term": { "c": 3 }
        }
      ]
    }
  }
}
meaning that should only adds to the score, nothing more.
The code above is not meant to be run; it only illustrates the logic behind each query. Elasticsearch does not necessarily work like this internally. I wrote these examples only to make the idea easier to understand.
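For anyone who wants to experiment, the four examples can be condensed into a small runnable toy model in Python. Everything here is an assumption made for illustration: the helper names (term, terms, bool_query), the sample documents, and especially the scoring, which simply counts matching must and should clauses, whereas real Elasticsearch scores are computed very differently.

```python
def term(field, value):
    """Clause that matches when the field equals the value exactly."""
    return lambda doc: doc.get(field) == value

def terms(field, values):
    """Clause that matches when the field equals any of the values (OR)."""
    return lambda doc: doc.get(field) in values

def bool_query(docs, must=(), filter_=(), should=()):
    """Toy bool query: must AND filter select documents, should adds score."""
    hits = []
    for doc in docs:
        if not all(clause(doc) for clause in must):
            continue
        if not all(clause(doc) for clause in filter_):
            continue
        matched_should = sum(1 for clause in should if clause(doc))
        # With only should clauses, at least one of them has to match.
        if should and not must and not filter_ and matched_should == 0:
            continue
        # Toy scoring: must and should count, filter does not.
        hits.append((doc["id"], len(must) + matched_should))
    return sorted(hits, key=lambda hit: -hit[1])

docs = [
    {"id": 1, "a": 1, "b": 2, "c": 3},
    {"id": 2, "a": 1, "b": 2, "c": 0},
    {"id": 3, "a": 1, "b": 9, "c": 3},
    {"id": 4, "a": 2, "b": 2, "c": 3},
]

first = bool_query(docs, must=[term("a", 1), term("b", 2)])
second = bool_query(docs, must=[terms("a", [1, 2])])
third = bool_query(docs, must=[term("a", 1)], filter_=[term("b", 2)])
fourth = bool_query(docs, must=[term("a", 1)], filter_=[term("b", 2)],
                    should=[term("c", 3)])

print(first)   # [(1, 2), (2, 2)]
print(second)  # [(1, 1), (2, 1), (3, 1), (4, 1)]
print(third)   # [(1, 1), (2, 1)]
print(fourth)  # [(1, 2), (2, 1)]
```

The third and fourth results contain the same documents; should only changes the score and therefore the ranking.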