When indexing a document, OpenSearch lets you POST to the index/_doc API to create it, and automatically generates a document ID if none is provided.
Is there a configuration option to disable this, so that the only way to create a document is to supply a document ID explicitly?
OpenSearch does not provide a built-in configuration option that forces clients to specify a document ID when indexing. By default, when you call POST /index/_doc without providing an ID, OpenSearch generates one for you.
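To make the difference between the two request forms concrete, here is a minimal Python sketch. The helper name index_request_line is hypothetical, not part of any OpenSearch client; it only builds the HTTP method and path and sends nothing.

```python
from typing import Optional, Tuple
from urllib.parse import quote


def index_request_line(index: str, doc_id: Optional[str] = None) -> Tuple[str, str]:
    """Return the (HTTP method, path) pair for indexing one document.

    Hypothetical helper for illustration only.
    """
    if doc_id is None or doc_id == "":
        # No ID supplied: this is the request form for which
        # OpenSearch generates the document ID itself.
        return "POST", f"/{index}/_doc"
    # Explicit ID: percent-encode it so characters such as '/'
    # stay inside a single path segment.
    return "PUT", f"/{index}/_doc/{quote(str(doc_id), safe='')}"
```

An application that wants explicit IDs only could refuse to send anything for which this helper returns POST, but that is a client-side convention, not something the cluster enforces.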
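In Elasticsearch or OpenSearch, you can use an ingest pipeline to reject a document when its _id field is missing or empty. In the bulk request below, the third action has no _id, so it is the one the pipeline is meant to reject.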
PUT _ingest/pipeline/require_id_pipeline
{
  "description": "Ensure document has an _id field, reject indexing if not",
  "processors": [
    {
      "script": {
        "source": """
          if (ctx._id == null || ctx._id == '') {
            throw new IllegalArgumentException('Document must have an _id');
          }
        """
      }
    }
  ]
}
POST _bulk?pipeline=require_id_pipeline
{ "index": { "_index": "test_exception", "_id": 1 } }
{ "title": "Rush", "year": 2013 }
{ "index": { "_index": "test_exception", "_id": "2" } }
{ "title": "Prisoners", "year": 2013 }
{ "index": { "_index": "test_exception" } }
{ "title": "World War Z" }
# Optionally, set the pipeline as the index default so that you don't need to pass ?pipeline= on each request.
PUT test_exception/_settings
{
  "index.default_pipeline": "require_id_pipeline"
}
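If you also want to catch the problem before the request leaves the client, the same rule can be checked against the bulk payload itself. The following is a rough Python sketch under that assumption; actions_missing_id is a hypothetical helper, it only parses the NDJSON body, and it does not replace the server-side pipeline.

```python
import json
from typing import List


def actions_missing_id(bulk_body: str) -> List[int]:
    """Return 0-based positions of index/create actions that lack an _id.

    Hypothetical client-side check that mirrors the pipeline's rule.
    """
    lines = [ln for ln in bulk_body.splitlines() if ln.strip()]
    missing = []
    position = 0
    i = 0
    while i < len(lines):
        action = json.loads(lines[i])
        op, meta = next(iter(action.items()))
        if op in ("index", "create"):
            if meta.get("_id") in (None, ""):
                missing.append(position)
            i += 2  # action line plus its source line
        elif op == "update":
            i += 2  # action line plus its doc/script line
        else:
            i += 1  # delete actions have no following line
        position += 1
    return missing


bulk_body = """
{ "index": { "_index": "test_exception", "_id": 1 } }
{ "title": "Rush", "year": 2013 }
{ "index": { "_index": "test_exception", "_id": "2" } }
{ "title": "Prisoners", "year": 2013 }
{ "index": { "_index": "test_exception" } }
{ "title": "World War Z" }
"""
print(actions_missing_id(bulk_body))
```

With the bulk body from the example above, only the third action (position 2) is reported, which is the same item the pipeline is expected to reject.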