The Elasticsearch index is created with the following body:
body = {
    "mappings": {
        "properties": {
            "TokenizedDocumentFileName": {
                "type": "text",
                "analyzer": "my_analyzer",
                "search_analyzer": "standard"
            }
        }
    },
    "settings": {
        "analysis": {
            "analyzer": {
                "my_analyzer": {
                    "tokenizer": "keyword",
                    "filter": ["word_delimiter", "lowercase"]
                }
            }
        },
        "number_of_shards": "1",
        "number_of_replicas": "0"
    }
}
Below is the metadata of 2 different documents in Elasticsearch:
{'_index': 'fileboundunitmanuals',
 '_type': '_doc',
 '_id': '997439.PDF',
 '_version': 2,
 '_seq_no': 166958,
 '_primary_term': 1,
 'found': True,
 '_source': {
     'IndexKey': '997439.PDF',
     'DocumentID': 997439,
     'Extension': 'PDF',
     'FileID': 174508,
     'DocumentFileName': '\\UNIT xxxxx\\411xxx\\A9.xxxx_xxxxx GAS ENGINE xxxxx_x_997439.PDF',
     'TokenizedDocumentFileName': '\\UNIT xxxxx\\411xxx\\A9. xxxx xxxxx GAS ENGINE xxxxx x 997439.PDF',
     'F1': 'UNIT xxxxx',
     'ProjectID': 8}}
2nd record:
{'_index': 'fileboundunitmanuals',
 '_type': '_doc',
 '_id': '3929829.pdf',
 '_version': 1,
 '_seq_no': 538517,
 '_primary_term': 3,
 'found': True,
 '_source': {
     'Extension': 'pdf',
     'DocumentID': 3929829,
     'IndexKey': '3929829.pdf',
     'FileID': '',
     'DocumentFileName': '\\Unit xxxxx\\Mary Testing\\marynewfiletest.pdf',
     'TokenizedDocumentFileName': '\\Unit xxxxx\\Mary Testing\\marynewfiletest.pdf',
     'F1': 'Unit xxxxx',
     'ProjectID': 8}}
Now, when searching in Elasticsearch, I use the following query for the 1st record:
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "TokenizedDocumentFileName": {
            "query": "997439"
          }
        }
      },
      "filter": {
        "bool": {
          "must": [
            {
              "term": {
                "ProjectID": 8
              }
            },
            {
              "term": {
                "Extension": "pdf"
              }
            }
          ]
        }
      }
    }
  }
}
Query to search for the 2nd record:
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "TokenizedDocumentFileName": {
            "query": "marynewfiletest"
          }
        }
      },
      "filter": {
        "bool": {
          "must": [
            {
              "term": {
                "ProjectID": 8
              }
            },
            {
              "term": {
                "Extension": "pdf"
              }
            }
          ]
        }
      }
    }
  }
}
The first query gives me the right result, since the query "997439" is present in TokenizedDocumentFileName, but when I search for the 2nd record with marytesting I get the following response:
{'took': 0, 'timed_out': False, '_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0}, 'hits': {'total': {'value': 0, 'relation': 'eq'}, 'max_score': None, 'hits': []}}
But when I give the filename along with the extension, i.e. "marytesting.pdf", I get the right result.
Output of GET fileboundunitmanuals:
{
  "fileboundunitmanuals" : {
    "aliases" : { },
    "mappings" : {
      "properties" : {
        "DocumentFileName" : {
          "type" : "text",
          "analyzer" : "my_analyzer",
          "search_analyzer" : "standard"
        },
        "DocumentID" : {
          "type" : "long"
        },
        "Extension" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "F1" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "FileID" : {
          "type" : "long"
        },
        "IndexKey" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "ProjectID" : {
          "type" : "long"
        },
        "TokenizedDocumentFileName" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    },
    "settings" : {
      "index" : {
        "number_of_shards" : "1",
        "provided_name" : "fileboundunitmanuals",
        "creation_date" : "1607069298331",
        "analysis" : {
          "analyzer" : {
            "my_analyzer" : {
              "filter" : [
                "word_delimiter",
                "lowercase"
              ],
              "tokenizer" : "keyword"
            }
          }
        },
        "number_of_replicas" : "0",
        "uuid" : "u8HasYfVT6iMr7XGpdjJHg",
        "version" : {
          "created" : "7090199"
        }
      }
    }
  }
}
So the question is: why does partial search work for the 1st record and not for the 2nd one?
According to your mapping, the field TokenizedDocumentFileName is just text and keyword, so it doesn't have your analyzers. So it's just a coincidence that your first query works.
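To see where that coincidence comes from, it can help to compare the tokens each analyzer produces through the _analyze API. The sketch below only builds the request bodies; the helper name analyze_body is hypothetical, and sending the bodies (for example as POST fileboundunitmanuals/_analyze) is left to whichever client you use. With the default standard analyzer I would expect 997439.PDF to be split into 997439 and pdf, while marynewfiletest.pdf stays a single token, but please verify that against your own cluster.

```python
def analyze_body(text, analyzer):
    """Build a request body for the _analyze API (helper name is hypothetical)."""
    return {"analyzer": analyzer, "text": text}


# File names taken from the two documents in the question.
first = "997439.PDF"
second = "marynewfiletest.pdf"

# Analyzer the field currently falls back to: the default "standard" one.
standard_requests = [analyze_body(t, "standard") for t in (first, second)]

# Analyzer that was intended. It only resolves when the request is sent to
# an index that defines it, e.g. POST fileboundunitmanuals/_analyze.
custom_requests = [analyze_body(t, "my_analyzer") for t in (first, second)]

for body in standard_requests + custom_requests:
    print(body)
```

Comparing the token lists returned for the two analyzers should show whether a bare file name without its extension can ever match the indexed terms.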
You should make sure to properly create your index with the right mapping before indexing your first document.
PS: I was able to create your index with the settings/mappings you gave and I got the expected result for both queries, so you're almost there.
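A quick way to confirm the mapping before indexing is to inspect the GET output for the field. This is a minimal sketch in plain Python, assuming the response has already been parsed into a dict; field_analyzer is a hypothetical helper, and index_info is a trimmed copy of the output posted in the question.

```python
def field_analyzer(index_info, index, field):
    """Return the index-time analyzer set on a field, or None if the mapping sets none."""
    properties = index_info[index]["mappings"]["properties"]
    return properties.get(field, {}).get("analyzer")


# Trimmed copy of the GET output from the question.
index_info = {
    "fileboundunitmanuals": {
        "mappings": {
            "properties": {
                "DocumentFileName": {
                    "type": "text",
                    "analyzer": "my_analyzer",
                    "search_analyzer": "standard",
                },
                "TokenizedDocumentFileName": {
                    "type": "text",
                    "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
                },
            }
        }
    }
}

for name in ("DocumentFileName", "TokenizedDocumentFileName"):
    print(name, "->", field_analyzer(index_info, "fileboundunitmanuals", name))
# DocumentFileName -> my_analyzer
# TokenizedDocumentFileName -> None
```

A result of None for TokenizedDocumentFileName means the field uses the default analyzer, which matches the diagnosis above.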