mongodbaggregation-frameworkunwind

Mongdb aggregation unwind not working as expected


I have data on mongodb like below

{
  "_id": {
    "$oid": "67d6dba348ee77f62b1527d3"
  },
  "job_code": "ACC",
  "job_id": "73337",
  "partners": [
    {
      "type": "Employee",
      "user_id": "132152"
    },
    {
      "type": "Employee",
      "user_id": "130947"
    },
    {
      "type": "Employee",
      "user_id": "132059"
    },
    {
      "type": "Employee",
      "user_id": "131235"
    },
    {
      "type": "Employee",
      "user_id": "131973"
    },
    {
      "type": "Employee",
      "user_id": "65335"
    },
    {
      "type": "Employee",
      "user_id": "101032"
    },
    {
      "type": "Employee",
      "user_id": "36598"
    },
    {
      "type": "Employee",
      "user_id": "94676"
    },
    {
      "type": "Employee",
      "user_id": "124424"
    }
  ],
  "timestamp": {
    "$date": "2024-11-15T08:20:16.000Z"
  },
  "user_id": "132059"
}

However, when I used unwind on 'partners', the resulting data will only return 1 record with the last record of the array which is "user_id": "124424".

But I am expecting 10 records to be return from unwind since the 'partners' array have 10 records. Does anyone know what is going on? I am scratching my head for an hour now

My whole query is below and I am using npm package @mongodb^6.4.0

    const res = await db
        .collection(targetCollection)
        .aggregate(
            [
                {
                    $match: {
                        job_code: {
                            $regex: `(^${jobCode}_)(LG[0-9]+|TG[0-9]+|LV[0-9]+|TV[0-9]+|L[0-9]+|T[0-9]+[^T00])$`,
                        },
                    },
                },
                {
                    $lookup: {
                        from: "partners",
                        let: {
                            job_id: "$job_id",
                        },
                        pipeline: [
                            {
                                $match: {
                                    $expr: {
                                        $and: [
                                            {
                                                $eq: ["$value.job_id", "$$job_id"],
                                            },
                                            {
                                                $eq: ["$value.type", "Employee"],
                                            },
                                            {
                                                $ne: ["$value.state", "deleted"],
                                            },
                                        ],
                                    },
                                },
                            },
                            {
                                $project: {
                                    _id: 0,
                                    type: "$value.type",
                                    user_id: "$value.user_id",
                                },
                            },
                        ],
                        as: "partners",
                    },
                },
                {
                    $unwind: "$partners",
                },
                {
                    $merge: sesColName,
                },
            ],
            {
                allowDiskUse: true,
            }
        )
        .toArray();

Solution

  • When you $unwind, all the documents share the same _id as in the original doc - as seen in Yong Shun's example. So when you $merge after that, effectively, each document with the same _id overwrites the previous. So you end up with just the last document after unwinding.

    To keep all documents, remove the _id with $unset or $project: { _id: 0 }. Then each doc gets a new auto-generated _id. However, you will always end up creating new docs every time you execute it, without updating/overwriting docs from previous executions. Depends on what outcome/result you want.

    db.collection.aggregate([
      // all the previous aggregation stages here
      {
        $unwind: "$partners"
      },
      {
        $unset: "_id"
      },
      {
        $merge: sesColName
      }
    ])
    

    Mongo Playground 1

    To avoid always creating new documents with new ids every time that aggregation is executed, you could also create consistent new id's before merging - by combing the existing _id with the partners.user_id, if that's unique in the partners array. Like this:

    db.collection.aggregate([
      // all the previous aggregation stages here
      {
        $unwind: "$partners"
      },
      {
        // create a new unique & consistent _id field
        $set: {
          _id: {
            original_id: "$_id",
            partner_id: "$partners.user_id"
          }
        }
      },
      {
        $merge: sesColName
      }
    ])
    

    Mongo Playground 2

    If keeping the original_id and partner_id separate is not important and you only need them to be unique & consistent, you could create a new String _id by just concatenating the two:

    db.collection.aggregate([
      // all the previous aggregation stages here
      {
        $unwind: "$partners"
      },
      {
        $set: {
          _id: {
            $concat: [
              // original _id and partner.user_id
              { $toString: "$_id" },
              "-",
              "$partners.user_id"
            ]
          }
        }
      },
      {
        $merge: sesColName
      }
    ])
    

    Mongo Playground 3 Result has id's like: "67d6dba348ee77f62b1527d3-132152"