I'm trying to access the email body text that was matched when searching .msg email files stored in an Azure Storage blob container. I am able to get the From, To, Subject and Date Sent values using these metadata properties:
metadata_content_type, metadata_message_from, metadata_message_from_email, metadata_message_to, metadata_message_to_email, metadata_message_cc, metadata_message_cc_email, metadata_message_bcc, metadata_message_bcc_email, metadata_creation_date, metadata_last_modified, metadata_subject
Documented here: https://learn.microsoft.com/en-us/azure/search/search-blob-metadata-properties
How can I retrieve the body and attachment text that was matched?
Are there additional fields I can add to my index and/or indexer?
I have tried the following fields:
{
"name": "email-msg-index",
"fields": [
{"name": "ID", "type": "Edm.String", "key": true, "searchable": false},
{"name": "metadata_subject", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": true, "facetable": false},
{"name": "metadata_content_type", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": true, "facetable": false},
{"name": "metadata_message_from", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": true, "facetable": false},
{"name": "metadata_message_from_email", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": true, "facetable": false},
{"name": "metadata_message_to", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": true, "facetable": false},
{"name": "metadata_message_to_email", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": true, "facetable": false},
{"name": "metadata_message_cc", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": true, "facetable": false},
{"name": "metadata_message_cc_email", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": true, "facetable": false},
{"name": "metadata_message_bcc", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": true, "facetable": false},
{"name": "metadata_message_bcc_email", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": true, "facetable": false},
{"name": "metadata_creation_date", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": true, "facetable": false},
{"name": "metadata_last_modified", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": true, "facetable": false}
]
}
I was able to obtain the content using the following index:
{
"name": "email-msg-index",
"fields": [
{"name": "ID", "type": "Edm.String", "key": true, "searchable": false},
{"name": "metadata_subject", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": true, "facetable": false},
{"name": "metadata_content_type", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": true, "facetable": false},
{"name": "metadata_message_from", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": true, "facetable": false},
{"name": "metadata_message_from_email", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": true, "facetable": false},
{"name": "metadata_message_to", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": true, "facetable": false},
{"name": "metadata_message_to_email", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": true, "facetable": false},
{"name": "metadata_message_cc", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": true, "facetable": false},
{"name": "metadata_message_cc_email", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": true, "facetable": false},
{"name": "metadata_message_bcc", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": true, "facetable": false},
{"name": "metadata_message_bcc_email", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": true, "facetable": false},
{"name": "metadata_creation_date", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": true, "facetable": false},
{"name": "metadata_last_modified", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": true, "facetable": false},
{"name": "content", "type": "Edm.String", "searchable": true, "filterable": false, "sortable": false, "facetable": false}
]
}
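The working index differs from the one I first tried only by the extra "content" field, which is where the blob indexer writes the extracted text. As a rough illustration, here is a minimal Python sketch that adds that field to an existing index definition held as a dict. The helper name with_content_field is hypothetical and not part of any Azure SDK; sending the updated definition to the service is left out.

```python
import copy

# Field definition copied from the working index above.
CONTENT_FIELD = {
    "name": "content",
    "type": "Edm.String",
    "searchable": True,
    "filterable": False,
    "sortable": False,
    "facetable": False,
}


def with_content_field(index_definition):
    """Return a copy of the index definition that includes the 'content' field.

    Hypothetical helper: it only manipulates the dict and does not call Azure.
    """
    updated = copy.deepcopy(index_definition)
    fields = updated.setdefault("fields", [])
    # Append the field only if the index does not already define it.
    if not any(field.get("name") == "content" for field in fields):
        fields.append(dict(CONTENT_FIELD))
    return updated


if __name__ == "__main__":
    original = {
        "name": "email-msg-index",
        "fields": [
            {"name": "ID", "type": "Edm.String", "key": True, "searchable": False},
        ],
    }
    print([f["name"] for f in with_content_field(original)["fields"]])
```

The helper leaves the input dict untouched and is safe to call twice, since it skips the append when a "content" field already exists.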
The indexer used is:
{
  "@odata.context": "https://<servicename>.search.windows.net/$metadata#indexers/$entity",
  "@odata.etag": "\"0x000000000000000\"",
  "name": "emailindexer",
  "dataSourceName": "email-blob-datasource",
  "targetIndexName": "email-msg-index",
  "parameters": {
    "configuration": {
      "indexedFileNameExtensions": ".msg",
      "dataToExtract": "contentAndMetadata",
      "parsingMode": "default"
    }
  }
}
The query is:
{
  "search": "{{search}}",
  "select": "metadata_subject, metadata_creation_date, metadata_message_from, metadata_message_to, content",
  "searchFields": "metadata_subject",
  "count": true
}
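The query above returns the whole "content" field but only searches metadata_subject. To see which part of the body actually matched, one option that may help is the "highlight" parameter of the search request, which asks the service to return matched fragments under "@search.highlights" in each result. The Python sketch below only builds the request body and reads fragments out of a result document; the function names are hypothetical, it makes no network call, and adding "content" to searchFields is my assumption about what is wanted here.

```python
import json


def build_search_request(term, highlight_field="content"):
    """Build a search request body that searches and highlights the body text.

    Hypothetical helper: it returns a plain dict to POST to the index's
    docs/search endpoint and does not contact the service itself.
    """
    return {
        "search": term,
        "select": "metadata_subject, metadata_creation_date, "
                  "metadata_message_from, metadata_message_to, content",
        # Assumption: search the extracted body as well as the subject.
        "searchFields": f"metadata_subject, {highlight_field}",
        # Ask for matched fragments from the body field.
        "highlight": highlight_field,
        "count": True,
    }


def matched_fragments(result_doc, field="content"):
    """Return the highlighted fragments for one result document, if any."""
    return result_doc.get("@search.highlights", {}).get(field, [])


if __name__ == "__main__":
    print(json.dumps(build_search_request("invoice"), indent=2))
```

The highlighted field has to be searchable in the index, which the "content" field defined above is.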