pythonmongodbaggregation-framework

Mongodb $loopkup: Lookup only those documents which satisfy a condition


I have two collections, tasks and users. The users collection has 3 important fields user_id, parent_user_id and details. tasks has user_id field which connects it with users.

The requirement of an API is to attach the relevant users.details with the tasks document and it has to be matched with parent_user_id. For this I am using $lookup to match and put the relevant user_details, but since sometimes the parent_user_id can be different to the user_id I have to use 2 lookups.

The first $lookup attaches the parent_user_id for the user_id in the tasks collection and the next $lookup attaches the details using the extracted parent_user_id.

My current pipeline:

[
{
    '$lookup': {
        'from': 'users', 
        'let': {
            'uid': '$user_id', 
            'pid': '$project_id'
        }, 
        'pipeline': [
            {
                '$match': {
                    '$expr': {
                        '$and': [
                            {
                                '$eq': [
                                    '$project_id', '$$pid'
                                ]
                            }, {
                                '$eq': [
                                    '$user_id', '$$uid'
                                ]
                            }
                        ]
                    }
                }
            }, {
                '$project': {
                    '_id': 0, 
                    'parent_user_id': 1
                }
            }
        ], 
        'as': 'parent_info'
    }
}, {
    '$replaceRoot': {
        'newRoot': {
            '$mergeObjects': [
                {
                    '$arrayElemAt': [
                        '$parent_info', 0
                    ]
                }, '$$ROOT'
            ]
        }
    }
}, {
    '$project': {
        'parent_info': 0
    }
}, {
    '$lookup': {
        'from': 'users', 
        'let': {
            'pid': '$project_id', 
            'uid': '$parent_user_id'
        }, 
        'pipeline': [
            {
                '$match': {
                    '$expr': {
                        '$and': [
                            {
                                '$eq': [
                                    '$project_id', '$$pid'
                                ]
                            }, {
                                '$eq': [
                                    '$user_id', '$$uid'
                                ]
                            }
                        ]
                    }
                }
            }, {
                '$project': {
                    '_id': 0, 
                    'details': 1
                }
            }
        ], 
        'as': 'user_info'
    }
}, {
    '$replaceRoot': {
        'newRoot': {
            '$mergeObjects': [
                {
                    '$arrayElemAt': [
                        '$user_info', 0
                    ]
                }, '$$ROOT'
            ]
        }
    }
}, {
    '$project': {
        'user_info': 0, 
        'project_id': 0
    }
}

]

This is working fine. The issue is most of the times the parent_user_id is same as user_id so I can just get the details in the first lookup itself and totally ignore the second lookup, but I am unable to acheive it.

Any help would be appreciated.


Solution

  • You can achieve this by adding is_same_user field in the $addFields stage and then filter out those users with the $match stage. Here is a full query:

    db.tasks.aggregate([
      {
        "$lookup": {
          "from": "users",
          "let": {
            "uid": "$user_id",
            "pid": "$project_id"
          },
          "pipeline": [
            {
              "$match": {
                "$expr": {
                  "$and": [
                    {
                      "$eq": [
                        "$project_id",
                        "$$pid"
                      ]
                    }
                  ]
                }
              }
            },
            {
              "$addFields": {
                "is_same_user": {
                  "$eq": [
                    "$user_id",
                    "$$uid"
                  ]
                },
                "lookup_user_id": {
                  "$ifNull": [
                    "$parent_user_id",
                    "$user_id"
                  ]
                }
              }
            },
            {
              "$match": {
                "$expr": {
                  "$or": [
                    {
                      "$and": [
                        {
                          "$eq": [
                            "$is_same_user",
                            true
                          ]
                        },
                        {
                          "$eq": [
                            "$user_id",
                            "$$uid"
                          ]
                        }
                      ]
                    },
                    {
                      "$eq": [
                        "$lookup_user_id",
                        "$$uid"
                      ]
                    }
                  ]
                }
              }
            },
            {
              "$project": {
                "_id": 0,
                "parent_user_id": 1,
                "details": 1
              }
            }
          ],
          "as": "user_info"
        }
      },
      {
        "$replaceRoot": {
          "newRoot": {
            "$mergeObjects": [
              {
                "$arrayElemAt": [
                  "$user_info",
                  0
                ]
              },
              "$$ROOT"
            ]
          }
        }
      },
      {
        "$project": {
          "user_info": 0,
          "project_id": 0
        }
      }
    ])
    

    Here is a link to MongoDB playground to see the query work: https://mongoplayground.net/p/LZlM9FnVs5U