I have this document:
{
"_id": "65b294740c5693e3deac0cea",
"name": "Islamic Republic Of Iran",
"population": 95000000,
"abb": "IRI",
"cities": [
{
"_id": "65b294740c5693e3deac0cf6",
"name": "Hamedan",
"population": 10,
"abb": "HM"
},
{
"_id": "65b294750c5693e3deac0cf7",
"name": "Tehran",
"population": 50,
"abb": "TH"
},
{
"_id": "65b294750c5693e3deac0cf8",
"name": "Kerman",
"population": 12,
"abb": "KM"
},
{
"_id": "65b294750c5693e3deac0cf9",
"name": "Esfahan",
"population": 25,
"abb": "ES"
},
{
"_id": "65b294750c5693e3deac0cfa",
"name": "Mashhad",
"population": 28,
"abb": "MS"
}
],
}
and I want to update the document and save 5 unique cities on the cities
field by the following array:
[
{
_id: "65b294750c5693e3deac0cfa",
name: "Mashhad",
abb: "MS",
population: 28
},
{
_id: "65b294750c5693e3deac0cfb",
name: "Tabriz",
abb: "TB",
population: 27
}
]
If you pay attention, the order of the abb
and population
fields in this array is not the same as the first array.
I use this update pipeline to update the cities
field:
[
{
"$set": {
"cities": {
"$filter": {
"input": "$cities",
"as": "citiesItem",
"cond": {
"$ne": [
"$$citiesItem._id",
"65b294750c5693e3deac0cf8"
]
}
}
}
}
},
{
"$set": {
"cities": {
"$setUnion": [
"$cities",
[
{
"_id": "65b294750c5693e3deac0cfa",
"name": "Mashhad",
"abb": "MS",
"population": 28
},
{
"_id": "65b294750c5693e3deac0cfb",
"name": "Tabriz",
"abb": "TB",
"population": 27
}
]
]
}
}
},
{
"$set": {
"cities": {
"$sortArray": {
"input": "$cities",
"sortBy": {
"_id": 1
}
}
}
}
},
{
"$set": {
"cities": {
"$slice": [
"$cities",
5
]
}
}
}
]
I expect this object:
{
"_id": "65b294750c5693e3deac0cf8",
"name": "Kerman",
"population": 12,
"abb": "KM"
}
to be erased and this object:
{
_id: "65b294750c5693e3deac0cfb",
name: "Tabriz",
abb: "TB",
population: 27
}
is added to the end of the array of cities
.
But due to the different order of the fields, instead of the city of "Tabriz", the city of "Mashhad" is stored twice at the end of the cities
field as follows:
{
"_id": "65b294740c5693e3deac0cea",
"name": "Islamic Republic Of Iran",
"population": 95000000,
"abb": "IRI",
"cities": [
{
"_id": "65b294740c5693e3deac0cf6",
"name": "Hamedan",
"population": 10,
"abb": "HM"
},
{
"_id": "65b294750c5693e3deac0cf7",
"name": "Tehran",
"population": 50,
"abb": "TH"
},
{
"_id": "65b294750c5693e3deac0cf9",
"name": "Esfahan",
"population": 25,
"abb": "ES"
},
{
"_id": "65b294750c5693e3deac0cfa",
"name": "Mashhad",
"population": 28,
"abb": "MS"
},
{
"_id": "65b294750c5693e3deac0cfa",
"name": "Mashhad",
"abb": "MS",
"population": 28
}
],
}
I even tried switching the order of the fields with the pipeline
below:
[
{
"$set": {
"cities": {
"$filter": {
"input": "$cities",
"as": "citiesItem",
"cond": {
"$ne": [
"$$citiesItem._id",
"65b294750c5693e3deac0cf8"
]
}
}
}
}
},
{
"$set": {
"cities": {
"$setUnion": [
{
"$project": {
"cities": {
"$map": {
"input": "$cities",
"as": "orderedItemcities",
"in": {
"_id": "$$orderedItemcities._id",
"name": "$$orderedItemcities.name",
"population": "$$orderedItemcities.population",
"abb": "$$orderedItemcities.abb"
}
}
}
}
},
[
{
"_id": "65b2976346bd44878b4ae02b",
"name": "Mashhad",
"abb": "MS",
"population": 28
},
{
"_id": "65b2976346bd44878b4ae02c",
"name": "Tabriz",
"abb": "TB",
"population": 27
}
]
]
}
}
},
{
"$set": {
"cities": {
"$sortArray": {
"input": "$cities",
"sortBy": {
"_id": 1
}
}
}
}
},
{
"$set": {
"cities": {
"$slice": [
"$cities",
5
]
}
}
}
]
But I got this error:
Invalid $set : : caused by : : Unrecognized expression '$project'
How can I create a proper pipeline for this?
Is there a way to uniquely combine two arrays in MongoDB based on one of the keys of an object?
For example, should we perform this combination according to the value of the _id
key?
Based on your second query, what did I fix:
Remove the $project
and cities
as not needed.
The position of the abb
field should come first before the population
.
The "Mashhad" _id
should be "65b294750c5693e3deac0cfa" according to the document.
db.collection.update({},
[
{
"$set": {
"cities": {
"$filter": {
"input": "$cities",
"as": "citiesItem",
"cond": {
"$ne": [
"$$citiesItem._id",
"65b294750c5693e3deac0cf8"
]
}
}
}
}
},
{
"$set": {
"cities": {
"$setUnion": [
{
"$map": {
"input": "$cities",
"as": "orderedItemcities",
"in": {
"_id": "$$orderedItemcities._id",
"name": "$$orderedItemcities.name",
"abb": "$$orderedItemcities.abb",
"population": "$$orderedItemcities.population"
}
}
},
[
{
"_id": "65b294750c5693e3deac0cfa",
"name": "Mashhad",
"abb": "MS",
"population": 28
},
{
"_id": "65b2976346bd44878b4ae02c",
"name": "Tabriz",
"abb": "TB",
"population": 27
}
]
]
}
}
},
{
"$set": {
"cities": {
"$sortArray": {
"input": "$cities",
"sortBy": {
"_id": 1
}
}
}
}
},
{
"$set": {
"cities": {
"$slice": [
"$cities",
5
]
}
}
}
])
Demo Solution 1 @ Mongo Playground
If you are looking alternative to check the _id
from the provided input array that exists in the current document and only add when it is not exist, you can use the below query:
db.collection.update({},
[
{
"$set": {
"cities": {
"$filter": {
"input": "$cities",
"as": "citiesItem",
"cond": {
"$ne": [
"$$citiesItem._id",
"65b294750c5693e3deac0cf8"
]
}
}
}
}
},
{
"$set": {
"cities": {
$reduce: {
input: [
{
"_id": "65b294750c5693e3deac0cfa",
"name": "Mashhad",
"abb": "MS",
"population": 28
},
{
"_id": "65b294750c5693e3deac0cfb",
"name": "Tabriz",
"abb": "TB",
"population": 27
}
],
initialValue: "$cities",
in: {
$concatArrays: [
"$$value",
{
$ifNull: [
{
$cond: {
if: {
$not: {
$in: [
"$$this._id",
"$cities._id"
]
}
},
then: [
"$$this"
],
else: null
}
},
[]
]
}
]
}
}
}
}
},
{
"$set": {
"cities": {
"$sortArray": {
"input": "$cities",
"sortBy": {
"_id": 1
}
}
}
}
},
{
"$set": {
"cities": {
"$slice": [
"$cities",
5
]
}
}
}
])