mongodbaggregatelookup

Mongo DB Aggregate => Lookup => combine lookup data property into root property array


Lot's a questions on Stack Overflow regarding Mongo DB, Aggregate, and lookup. However, I could not find direct answers to the issue I am struggling with.

Suppose we have two collections: Support and Images. The Support document has a property called instructions that is an array of objects, which each contain an image property that is the string value of the images collection id. During the aggregation, the support document is found during match (returning one support document), then I need the instructions array objects to each get their "image" property replaced with the found images from the lookup, and ultimately return the root support object with only the instrucitons.image property changed.

No new properties added, only use the instructions.image tag from each object in the instructions array to get the correct images, then replace those strings with the actual image value from the images array.

  1. How is this done?
  2. Is there any way to get access to the "localfield" in the lookup without using "let" if I was to use a lookup pipeline?
  3. Could this all be done inside of a lookup pipeline?
  4. What is the most efficient way of doing this?
    DB.support = [
    {
        _id: ObjectId("12345"),
        title: "Support",
        instructions: [
            {
                image: "image-id-string-one"
            },
            {
                image: "image-id-string-two"
            },
            {
                image: "image-id-string-three"
            },
        ]
    }]

    DB.images = [
    {
        _id: ObjectId("11111111"),
        id:"image-id-string-one",
        image: "actual-image-one"
    },
    {
        _id: ObjectId("222222222"),
        id:"image-id-string-two",
        image: "actual-image-two"
    },
    {
        _id: ObjectId("333333333"),
        id:"image-id-string-three",
        image: "actual-image-three"
    }]

    db.collection('support').aggregate([
    {
        "$match": {
            "title": "Support"
        }
    },
    // Let's say we match on title and expect only one record from above match
    {
        "$unwind": "$instructions"
    },
    {
        "$lookup": {
            "from": "images",
            "localField": "instructions.image",
            "foreignField": "id",
            "as": "images"
        }
    },
    {
        "$unwind": "$images"
    },
    {
        "$set": {
            "instructions.image": "$images.image"
        }
    },
    // Remove images array artifact from lookup from object.
    {
        "$unset": ["images"]
    }])

    // So that the return object would look like this

    {
    _id: ObjectId("12345"),
    title: "Support",
    instructions: [
        {
            image: "actual-image-one"
        },
        {
            image: "actual-image-two"
        },
        {
            image: "actual-image-three"
        },
    ]}

// NOTE: I have tried so many ways to try to get to the above object but nothing.

Solution

  • You were almost there.

    The pipeline you show would result in a separate support document for each instruction, with the image field already replaced with the desired value.

    From that point you would need to:

    This might look like:

    {$group: {
          _id: "$_id",
          doc: {$first: "$$ROOT"},
          instructions: {$push: "$instructions"}
    }},
    {$replaceRoot: {
          newRoot: {"$mergeObjects": ["$doc",{instructions: "$instructions"}]}
    }}
    

    Playground

    Result from adding those 2 stages:

    [
      {
        "_id": ObjectId("5a934e000102030405000003"),
        "instructions": [
          {"image": "actual-image-one"      },
          {"image": "actual-image-two"},
          {"image": "actual-image-three"}
        ],
        "title": "Support"
      }
    ]
    

    Edit: Adding some additional explanation

    In an aggregation pipeline there are some predefined system variables, accessed with the $$ prefix, listed at: https://www.mongodb.com/docs/manual/reference/aggregation-variables/#system-variables

    The $$ROOT variable refers to the current document as it was at the start of the current stage.

    In the $group stage, the $first operator means take the value from only the first document in the group, so doc: {$first: "$$ROOT"} means to assign a new field named "doc" the entire first document into the group. For this example, that means it is saving a copy of all of the fields so that we can include them in the final document later, i.e. doc is a temporary field.

    The $replaceRoot stage takes an object with a single field named "newRoot". Each existing document in the pipeline is discarded, and the value of the newRoot field is passed along the pipeline as the new document.

    The $mergeObjects operator takes an array of documents and combines their fields to form a new documents. If any fields occur in multiple documents in the array, the last value will be the one used.

    In this case we create a new object with a single field instructions, to hold the array that was built in the $group stage, and merge that into the document in the doc field, which contains the original full document (which had only a single instruction).