I am trying to add a custom skillset to my indexer, but I am running into an error that I don't know how to solve. My skills:
from azure.search.documents.indexes.models import (
    AzureOpenAIEmbeddingSkill,
    SearchIndexerSkillset,
    SplitSkill,
)

# https://learn.microsoft.com/en-us/python/api/azure-search-documents/azure.search.documents.indexes.models.splitskill?view=azure-python
split_skill = SplitSkill(
    name="split",
    description=None,
    context="/document/reviews_text",
    inputs=[{"name": "text", "source": "/document/content"}],
    outputs=[{"name": "textItems", "targetName": "pages"}],
    text_split_mode="pages",
    default_language_code="en",
    maximum_page_length=1000,
)
# https://learn.microsoft.com/en-us/python/api/azure-search-documents/azure.search.documents.indexes.models.azureopenaiembeddingskill?view=azure-python-preview
text_embedding_skill = AzureOpenAIEmbeddingSkill(
    name="embedding",
    description=None,
    inputs=[{"name": "text", "source": "/pages"}],
    outputs=[{"name": "embeddings", "targetName": "embeddings"}],
    resource_uri="xxx",
    deployment_id="yyy",
    api_key="zzz",
)
How they are added to the skillset:
# Define the skillset with the split skill and the text embedding skill
skillset = SearchIndexerSkillset(
    name="my-text-embedding-skillset",
    skills=[split_skill, text_embedding_skill],
    description="A skillset for creating text embeddings",
)
The error I get is:
azure.core.exceptions.HttpResponseError: () One or more skills are invalid. Details: Error in skill 'embedding': Outputs are not supported by skill: embeddings
Code:
Message: One or more skills are invalid. Details: Error in skill 'embedding': Outputs are not supported by skill: embeddings
Should I create the index first and match the output name to the one in the index? I would be grateful if anybody could point to a good tutorial on this :)
The name in the outputs for the AzureOpenAIEmbeddingSkill needs to be the singular "embedding", as documented in the skill's reference. You can make the targetName whatever you want (to use in later skills or when mapping to the index), but the name needs to be exactly what is documented.
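To make the difference concrete, here is a minimal sketch of the outputs list before and after the fix. It uses only plain dicts, the same shape the question already passes to the SDK. The SUPPORTED_OUTPUT_NAMES table and the unsupported_output_names helper are hypothetical names made up for illustration; the table only covers the two output names mentioned in this thread and is not the service's full list.

```python
# Illustrative table (an assumption for this sketch): the output names
# mentioned in this thread, keyed by skill type. Not an exhaustive list.
SUPPORTED_OUTPUT_NAMES = {
    "SplitSkill": {"textItems"},
    "AzureOpenAIEmbeddingSkill": {"embedding"},
}


def unsupported_output_names(skill_type, outputs):
    """Return the output names that are not in the table for skill_type."""
    allowed = SUPPORTED_OUTPUT_NAMES[skill_type]
    return [o["name"] for o in outputs if o["name"] not in allowed]


# Outputs from the question: "name" is the plural "embeddings",
# which is what the service rejects.
question_outputs = [{"name": "embeddings", "targetName": "embeddings"}]

# Corrected outputs: "name" is the documented singular "embedding".
# "targetName" is free-form, so it is kept exactly as in the question.
fixed_outputs = [{"name": "embedding", "targetName": "embeddings"}]

print(unsupported_output_names("AzureOpenAIEmbeddingSkill", question_outputs))
# → ['embeddings']
print(unsupported_output_names("AzureOpenAIEmbeddingSkill", fixed_outputs))
# → []
```

In the question's code, this means passing fixed_outputs (or the equivalent literal) as the outputs argument of AzureOpenAIEmbeddingSkill and leaving everything else about that argument unchanged.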