I am new to Elasticsearch and have a synonym analyzer in place, which looks like this:
{
  "settings": {
    "index": {
      "analysis": {
        "filter": {
          "graph_synonyms": {
            "type": "synonym_graph",
            "synonyms": [
              "gowns, dresses",
              "backpacks, bags",
              "coats, jackets"
            ]
          }
        },
        "analyzer": {
          "search_time_analyzer": {
            "tokenizer": "standard",
            "filter": [
              "lowercase",
              "graph_synonyms"
            ]
          }
        }
      }
    }
  }
}
And the mapping looks like this:
{
  "properties": {
    "category": {
      "type": "text",
      "search_analyzer": "search_time_analyzer",
      "fields": {
        "no_synonyms": {
          "type": "text"
        }
      }
    }
  }
}
If I search for gowns, I get proper results for both gowns and dresses.
The problem is that if I search for red gowns (the system does not have any red gowns), the expected behavior is to search for red dresses and return those results. Instead, it returns gowns and dresses irrespective of the color.
I want to configure the system so that it considers both terms and their respective synonyms, if any, and then returns the results.
For reference, this is what my search query looks like:
"query":
{
  "bool":
  {
    should:
    [
      {
        "multi_match":
        {
          "boost": 300,
          "query": term,
          "type": "cross_fields",
          "operator": "or",
          "fields": ["bu.keyword^10", "bu^10", "category.keyword^8", "category^8", "category.no_synonyms^8", "brand.keyword^7", "brand^7", "colors.keyword^2", "colors^2", "size.keyword", "size", "hash.keyword^2", "hash^2", "name"]
        }
      }
    ]
  }
}
Sample document:
_source: {
  productId: '12345',
  name: 'RUFFLE FLORAL TRIM COTTON MAXI DRESS',
  brand: [ 'self-portrait' ],
  mainImage: 'http://test.jpg',
  description: 'Self-portrait presents this maxi dress, crafted from cotton, to offer your off-duty ensembles an elegant update. Trimmed with ruffled broderie details, this piece is an effortless showcase of modern femininity.',
  status: 'active',
  bu: [ 'womenswear' ],
  category: [ 'dresses', 'gowns' ],
  tier1: [],
  tier2: [],
  colors: [ 'WHITE' ],
  size: [ '4', '6', '8', '10' ],
  hash: [
    'ballgown', 'cotton',
    'effortless', 'elegant',
    'floral', 'jar',
    'maxi', 'modern',
    'off-duty', 'ruffle',
    'ruffled', '1',
    '2', 'crafted'
  ],
  styleCode: '211274856'
}
How can I achieve the desired output? Any help would be appreciated. Thanks
You can configure an index-time analyzer instead of a search-time analyzer, like below:
{
  "properties": {
    "category": {
      "type": "text",
      "analyzer": "search_time_analyzer",
      "fields": {
        "no_synonyms": {
          "type": "text"
        }
      }
    }
  }
}
Once you are done with the index mapping change, reindex your data and try the query below.
Please note that I have changed operator to "and" and analyzer to "standard":
{
  "query": {
    "multi_match": {
      "boost": 300,
      "query": "gowns red",
      "analyzer": "standard",
      "type": "cross_fields",
      "operator": "and",
      "fields": [
        "category",
        "colors"
      ]
    }
  }
}
Why your current query is not working:
Indexing:
Your current index mapping indexes data with the standard analyzer, so none of your category values are indexed with their synonyms.
Searching:
Your current query uses operator "or", so if you search for red gowns it creates a query like red OR gowns OR dresses and returns results irrespective of the color. Also, if you change operator to "and" in the existing configuration, it will return zero results, because it creates a query like red AND gowns AND dresses.
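To check what the search analyzer actually does with the query text, the _analyze API can help. This is a minimal sketch, assuming the index is named products (a hypothetical name, not taken from the question); the body below would be sent as GET /products/_analyze:

```json
{
  "analyzer": "search_time_analyzer",
  "text": "red gowns"
}
```

The response lists each token together with its position, so you can see which synonym tokens were added for gowns and whether red was left untouched. Check the actual output on your own cluster rather than relying on my description.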
Solution: Once you make the changes I suggested, synonyms will be indexed for the category field as well, and it will work with the "and" operator. So if you try the query gowns red, it will create a query like gowns AND red. It will match because the category field has both values, gowns and dresses, due to the synonyms applied at index time.
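One caveat, to the best of my knowledge: the Elasticsearch documentation describes the synonym_graph filter as intended for search analyzers, and suggests the plain synonym filter when synonyms are applied at index time. A sketch of index settings along those lines follows. The names index_synonyms and index_time_analyzer are my own placeholders, and the "analyzer" in the category mapping would then point at index_time_analyzer:

```json
{
  "settings": {
    "index": {
      "analysis": {
        "filter": {
          "index_synonyms": {
            "type": "synonym",
            "synonyms": [
              "gowns, dresses",
              "backpacks, bags",
              "coats, jackets"
            ]
          }
        },
        "analyzer": {
          "index_time_analyzer": {
            "tokenizer": "standard",
            "filter": [
              "lowercase",
              "index_synonyms"
            ]
          }
        }
      }
    }
  }
}
```

For single-word synonyms like the ones in the question, the two filter types should behave much the same. The distinction matters mainly for multi-word synonyms.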