arraysmongodbaggregation-frameworklookup

MongoDB lookup on two field with condition of third field (array)


I have these two MongoDB collections:

Collection A

{ "_id": 1, "field1": "A", "field2": "", "value": 42 }
{ "_id": 2, "field1": "B", "field2": "", "value": 99 }
{ "_id": 3, "field1": "C", "field2": "", "value": 15 }
{ "_id": 4, "field1": "C", "field2": "Z", "value": 15 }

Collection B

{ "_id": 1, "field1": "A", "field2": "", "arrayField": [10, 42, 73] }
{ "_id": 2, "field1": "B", "field2": "", "arrayField": null }
{ "_id": 3, "field1": "C", "field2": "", "arrayField": null }
{ "_id": 4, "field1": "C", "field2": "Z", "arrayField": [99, 15] }

I need to join B on A based on field1, field2 and value-arrayField; the first two have to always match, but the third condition is only needed when arrayField is not empty, otherwise the documents can be disambiguated by the first two. This is what I've arrived at working in Compass:

[
  {
    $lookup: {
      from: "Collection B",
      let: {
        value: "$value",
        field1: "$field1",
        field2: "$field2"
      },
      pipeline: [
        {
          $match: {
            $expr: {
             $and: [
               {$eq: ["$field1","$$field1"]},
               {$eq: ["$field2","$$field2"]},
               {$cond: [
                 {$ifNull: ["$arrayField",false]},
                 {$in: ["$$value","$arrayField"]},
                 true
               ]}
             ] 
            }
          }
        }
      ],
      as: "result"
    }
  }
]

This is close enough, but it only matches documents in B where the arrayField is NOT empty. Simple matches between the other two fields are ignored. How can I prevent this behaviour? Many thanks in advance for any advice.


Solution

  • You are actually very close, in terms of identifying the null case in the comment. You just have to wrap it with $ifNull to fall back both null case and non-existent field case to both empty array and check for the array size.

    Repost of the playground in comment:

    db.collA.aggregate([
      {
        $lookup: {
          from: "collB",
          let: {
            v: "$value",
            f1: "$field1",
            f2: "$field2"
          },
          pipeline: [
            {
              $match: {
                $expr: {
                  $and: [
                    {
                      $eq: [
                        "$field1",
                        "$$f1"
                      ]
                    },
                    {
                      $eq: [
                        "$field2",
                        "$$f2"
                      ]
                    },
                    {
                      "$cond": {
                        "if": {
                          $gt: [
                            {
                              "$size": {
                                "$ifNull": [
                                  "$arrayField",
                                  []
                                ]
                              }
                            },
                            0
                          ]
                        },
                        "then": {
                          $in: [
                            "$$v",
                            "$arrayField"
                          ]
                        },
                        "else": true
                      }
                    }
                  ]
                }
              }
            }
          ],
          as: "result"
        }
      }
    ])
    

    Mongo Playground