mongodbmongodb-queryaggregation-frameworkgraphlookup

How to use graphLookup query of MongoDB to answer the question : Give me all persons connected to “X” through hotels


My documents look as below :

 { _id: 1, personId: 'Sai', pnr: "P1", flight: 'F1', hotel: 'H1' },
  { _id: 2, personId: 'Sai', pnr: "P2", flight: 'F2', hotel: 'H2' },
  { _id: 3, personId: 'Sai', pnr: "P3", flight: 'F3', hotel: 'H3' },
  { _id: 4, personId: 'Sai', pnr: "P4", flight: 'F4', hotel: 'H4' },
  { _id: 5, personId: 'Sai', pnr: "P5", flight: 'F5', hotel: 'H5' },
  { _id: 6, personId: 'PJ', pnr: "P1", flight: 'F1', hotel: 'H2' },
  { _id: 7, personId: 'PJ', pnr: "P2", flight: 'F1', hotel: 'H3' },
  { _id: 8, personId: 'Kumar', pnr: "P6", flight: 'F6', hotel: 'H1' },
  { _id: 9, personId: 'Kumar', pnr: "P7", flight: 'F7', hotel: 'H1' },
  { _id: 10, personId: 'Kumar', pnr: "P8", flight: 'F8', hotel: 'H1' },
  { _id: 11, personId: 'Kumar', pnr: "P9", flight: 'F9', hotel: 'H1' },
  { _id: 12, personId: 'Kannan', pnr: "P10", flight: 'F10', hotel: 'H1' },
  { _id: 13, personId: 'Kannan', pnr: "P11", flight: 'F11', hotel: 'H6' },
  { _id: 14, personId: 'Akansha', pnr: "P12", flight: 'F12', hotel: 'H6' },
  { _id: 15, personId: 'Akansha', pnr: "P13", flight: 'F13', hotel: 'H7' }

I want to formulate the below relationship :

Sai(H1) --> Kannan(H1, H6) --> Akansha(H6)

The inputs to my query should only be 'Sai' and 'H1'. And the graphLookup should do all the rest of the work to find the above mentioned relationships. Can someone help me with the same.

I have tried some cases, but in all of them, I have to provide another input after the first graphLookup to find the exact relationship. I do not want that to be the case because in real world, I will not know the relations between any of the documents.

The query i tried is as below :

db.person_events.aggregate([
  { $match: { personId: 'Sai', hotel: 'H1' } },
  {
    $graphLookup: {
      from: 'person_events',
      startWith: '$hotel',
      connectFromField: 'hotel',
      connectToField: 'hotel',
      as: 'hotel_connections'
    }
  },
  {
      $project: {
          hotel_connections: 1, _id: 0
      }
  },
  {
      $project: {
         hotel_connections: {
            $filter: {
               input: '$hotel_connections',
               as: 'hotelConnection',
               cond: { $eq: [ '$$hotelConnection.personId', 'Kannan' ] }
            }
         }
      }
   },
   {
    $graphLookup: {
      from: 'person_events',
      startWith: 'Kannan',
      connectFromField: 'personId',
      connectToField: 'personId',
      as: 'person_connections'
    }
  },
  {
      $project: {
          hotel_connections: 1, person_connections: 1, _id: 0
      }
  },
   {
    $graphLookup: {
      from: 'person_events',
      startWith: 'H6',
      connectFromField: 'hotel',
      connectToField: 'hotel',
      as: 'final_connections'
    }
  }
  ])

Solution

  • I don't think you can get your desired output just using $graphlookup as it'll just match wherever it finds H1 and as you didn't specify maxDepth, it'll just fetch all sub-documents but can you try this in the aggregate pipeline, I don't know if it works, just a query, I worked on after looking at your desired hotel relation requirements, replace "Sai" and "H1" with your required query inputs:

    db.person_events.aggregate([
      {
        $group: {
          _id: "$personId",
          personId: {
            $min: "$personId"
          },
          hotels: {
            $push: "$hotel"
          }
        }
      },
      {
        $match: {
          personId: {
            $ne: "Sai"
          }
        }
      },
      {
        $lookup: {
          from: "person_events",
          let: {
            personId: "$personId",
            hotel: "$hotels"
          },
          as: "hotel_connections",
          pipeline: [
            {
              $match: {
                $expr: {
                  $and: [
                    {
                      $in: [
                        "$hotel",
                        "$$hotel"
                      ]
                    },
                    {
                      $ne: [
                        "H1",
                        "$hotel"
                      ]
                    },
                    {
                      $ne: [
                        "$personId",
                        "$$personId"
                      ]
                    },
                    {
                      $ne: [
                        "Sai",
                        "$personId"
                      ]
                    }
                  ],
    
                }
              }
            },
            {
              $group: {
                _id: "$personId",
                hotel: {
                  $min: "$hotel"
                },
                personId: {
                  $min: "$personId"
                }
              }
            }
          ]
        }
      },
      {
        $match: {
          hotels: {
            $in: [
              "H1"
            ]
          },
          hotel_connections: {
            $ne: []
          }
        }
      }
    ])