I have an Azure Cognitive Search index which indexes data from multiple data sources. Each data source is indexed with a nearly identical indexer, and each indexer calls the same skillset configuration.
Within the index definition I have a field labeled "datasource" which is intended to identify the data source for a particular document. I would like to have the indexer, or a modular skill such as a conditional skill, set the value of this field based on the data source. I understand it is possible to use a conditional skill to set the value of a field if a value is not found, but I want to avoid having to create a new skillset for every indexer. My data sources are documents of multiple types in blob containers.
Using only the indexer definition, is it possible to assign a string value to a field, either by setting it manually in the definition or by somehow extracting the name of the data source? Or can this be done with a modular skill in the skillset definition?
An avenue I have been pursuing is setting user-specified blob metadata at the container level. However, I have not been able to successfully retrieve this information with either the indexer or skillset. I do not want to set this user-specified blob metadata on every single blob in a container.
Unfortunately, it is not possible to configure a blob data source so that it passes unique information to the skillset. Having a separate skillset per data source may be the cleanest option. Alternatively, you could pass metadata_storage_path to a custom skill and parse the container name out of the path, returning a value either by convention or through a mapping.
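As a rough illustration of the custom skill approach, here is a minimal Python sketch of the logic such a skill could run. It assumes the skill receives metadata_storage_path as an input in the standard custom Web API skill request shape (a "values" array of records with "recordId" and "data"), and that the path is an ordinary blob URL whose first path segment is the container. The function names, the CONTAINER_LABELS mapping, and the "datasource" output name are hypothetical; you would still need to host this behind an HTTP endpoint and map the output to your index field.

```python
from urllib.parse import urlparse, unquote

# Hypothetical mapping from container name to the label stored in "datasource".
# Containers not listed here fall back to the container name itself.
CONTAINER_LABELS = {
    "contracts": "Contracts",
    "invoices": "Invoices",
}


def container_from_path(storage_path):
    """Return the first path segment of a blob URL (the container), or None."""
    path = urlparse(storage_path).path
    segments = [segment for segment in path.split("/") if segment]
    return unquote(segments[0]) if segments else None


def run_skill(request_body):
    """Build a custom skill response with a 'datasource' value per record."""
    results = []
    for record in request_body.get("values", []):
        storage_path = record.get("data", {}).get("metadata_storage_path", "")
        container = container_from_path(storage_path)
        result = {
            "recordId": record["recordId"],
            "data": {},
            "errors": [],
            "warnings": [],
        }
        if container is None:
            # No container could be parsed, so report it instead of guessing.
            result["errors"].append(
                {"message": "Could not parse a container from metadata_storage_path."}
            )
        else:
            result["data"]["datasource"] = CONTAINER_LABELS.get(container, container)
        results.append(result)
    return {"values": results}
```

Because the same skill derives the value from each document's own path, one skillset can be shared by all of the indexers. Using a mapping with a fallback to the raw container name means a new container works without a code change, at the cost of a less friendly label until the mapping is updated.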