I'm using NEST 7.0 and C# to run search queries against Elasticsearch in the Fluent DSL style. I use MultiMatch to search several fields for a given string value:
queryContainer &= Query<Document>.MultiMatch(m => m
    .Fields(fields)
    .Query(searchParams.SearchValue)
    .Type(TextQueryType.MostFields));
For each document I receive its _score and Source data. I can get both from Response.Hits.
But how can I get the number of occurrences of the search value in each document? I'd like to receive something like this:
Search value: "search"
Search fields: title, description
Results:
- Doc1: 5 occurrences
- Doc2: 0 occurrences
- Doc3: 3 occurrences
- Doc4: 1 occurrence
...
Thanks in advance for your help!
There is no direct way to do this in Elasticsearch. The closest option is the multi term vectors API (_mtermvectors).
Query (the "ids" array lists the _id of every document to inspect):
POST /index51/_mtermvectors
{
  "ids" : ["1", "2"],
  "parameters": {
    "fields": [
      "text"
    ],
    "term_statistics": true
  }
}
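The response lists every requested document with statistics for each term in the field. The "term_freq" value is the number of times that term occurs in the document's field, so it is the per-document occurrence count; "doc_freq" is the number of documents containing the term and "ttf" is its total frequency across all documents.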
Result:
{
  "docs" : [
    {
      "_index" : "index51",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 2,
      "found" : true,
      "took" : 3,
      "term_vectors" : {
        "text" : {
          "field_statistics" : {
            "sum_doc_freq" : 7,
            "doc_count" : 3,
            "sum_ttf" : 7
          },
          "terms" : {
            "another" : {
              "doc_freq" : 2,
              "ttf" : 2,
              "term_freq" : 1,
              "tokens" : [
                {
                  "position" : 0,
                  "start_offset" : 0,
                  "end_offset" : 7
                }
              ]
            },
            "test" : {
              "doc_freq" : 3,
              "ttf" : 3,
              "term_freq" : 1,
              "tokens" : [
                {
                  "position" : 2,
                  "start_offset" : 16,
                  "end_offset" : 20
                }
              ]
            },
            "twitter" : {
              "doc_freq" : 2,
              "ttf" : 2,
              "term_freq" : 1,
              "tokens" : [
                {
                  "position" : 1,
                  "start_offset" : 8,
                  "end_offset" : 15
                }
              ]
            }
          }
        }
      }
    },
    {
      "_index" : "index51",
      "_type" : "_doc",
      "_id" : "2",
      "_version" : 1,
      "found" : true,
      "took" : 2,
      "term_vectors" : {
        "text" : {
          "field_statistics" : {
            "sum_doc_freq" : 7,
            "doc_count" : 3,
            "sum_ttf" : 7
          },
          "terms" : {
            "test" : {
              "doc_freq" : 3,
              "ttf" : 3,
              "term_freq" : 1,
              "tokens" : [
                {
                  "position" : 0,
                  "start_offset" : 0,
                  "end_offset" : 4
                }
              ]
            }
          }
        }
      }
    }
  ]
}
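To turn a response like this into per-document occurrence counts, one option is to sum "term_freq" for the search term over the requested fields. The following is only a minimal sketch that parses the raw JSON with System.Text.Json rather than going through NEST. The class and method names (TermVectorCounter, CountOccurrences) are hypothetical, and it assumes the term is passed in its analyzed form (for example lowercased, if the field uses the standard analyzer) and that the search value is a single term.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper, not part of NEST or Elasticsearch.
public static class TermVectorCounter
{
    // Returns a map of document _id -> summed term_freq of `term` over `fields`.
    // Documents without term vectors, or without the term, get a count of 0.
    public static Dictionary<string, int> CountOccurrences(
        string responseJson, string term, IReadOnlyCollection<string> fields)
    {
        var counts = new Dictionary<string, int>();
        using var parsed = JsonDocument.Parse(responseJson);

        foreach (var doc in parsed.RootElement.GetProperty("docs").EnumerateArray())
        {
            var id = doc.GetProperty("_id").GetString() ?? string.Empty;
            var total = 0;

            if (doc.TryGetProperty("term_vectors", out var vectors))
            {
                foreach (var field in fields)
                {
                    // Look up term_vectors.<field>.terms.<term>.term_freq if present.
                    if (vectors.TryGetProperty(field, out var fieldVector)
                        && fieldVector.TryGetProperty("terms", out var terms)
                        && terms.TryGetProperty(term, out var stats))
                    {
                        total += stats.GetProperty("term_freq").GetInt32();
                    }
                }
            }

            counts[id] = total;
        }

        return counts;
    }
}
```

Applied to the response above with term "twitter" and the single field "text", this would give 1 for document "1" and 0 for document "2". A multi-word search value would need one lookup per analyzed token, which this sketch does not attempt.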
The IDs of all documents can be fetched using the scroll API.
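Once the IDs are known (for example, the _id values of the hits already returned by the search), the _mtermvectors request body shown earlier can be assembled in code. This is a sketch under assumptions: MtermvectorsRequest and BuildBody are hypothetical names, and the body is built with System.Text.Json so it can be sent with any HTTP client or a low-level Elasticsearch client.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical helper, not part of NEST or Elasticsearch.
public static class MtermvectorsRequest
{
    // Builds the JSON body for POST /<index>/_mtermvectors
    // from the document ids and the fields to inspect.
    public static string BuildBody(IEnumerable<string> ids, IEnumerable<string> fields)
    {
        var body = new Dictionary<string, object>
        {
            ["ids"] = ids.ToArray(),
            ["parameters"] = new Dictionary<string, object>
            {
                ["fields"] = fields.ToArray(),
                // Only needed for doc_freq and ttf; term_freq is returned regardless.
                ["term_statistics"] = true
            }
        };

        return JsonSerializer.Serialize(body);
    }
}
```

For the question's scenario, the fields argument would be the searched fields, such as "title" and "description", so that the counts cover every field the MultiMatch query looked at.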