How to Create a Case-Insensitive Unique Compound Index in MongoDB?
Question Body: I'm trying to create a compound unique text index in MongoDB with the following fields:
tenant_id: 1 seller_id: 1 seller_sku_code: text However, I'm encountering an error when trying to create the index: Index build failed: 4df81a22-042f-4e51-bbec-b0f8a7dafe2e: Collection product_db.skus ( 83cc8978-b0fe-4c4b-ae58-691e32db7f95 ) :: caused by :: E11000 duplicate key error collection: product_db.skus index: tenant_id_1_seller_id_1_seller_sku_code_text dup key: { tenant_id: "1", seller_id: "113", _fts: "acc", _ftsx: 0.75 }
I need to ensure that the seller_sku_code is unique within the database and is case-insensitive. For example:
"ABC" and "abc" should not be allowed simultaneously. "ACC-2001" and "ACC-2000" should be allowed to coexist.
How can I create this unique, case-insensitive index on seller_sku_code while still ensuring the uniqueness of the tenant_id and seller_id fields?
Text index is probably not what you are looking for.
"ACC-2001" and "ACC-2000" are strings, and contain some alphanumeric characters, but they are not text.
To ensure that no two documents in the entire database have the same seller_sku_code, create a unique index on just that field, which a case-insensitive collation, like:
db.collection.createIndex(
{seller_sku_code:1},
{
unique:true,
collation:
{
locale : "en_US"
strength : 1
}
}
)
Use the locale that is appropriate, and strength of 1 or 2 for case-insensitive.
The problem you were encountering with the index build is the text index tokenizer sees the dash -
as a separator, so instead of indexing the values "ACC-2001" it was indexing the values "ACC" and "2001". That would prevent any other document with the same tenant and seller_id from having a seller_sku_code starting with "ACC-".