mongodbmongodb-queryaggregation-frameworkpymongoaggregate-functions

MongoDB: Map array of ids to objects while preserving order via aggregation functions


I am using MongoDB to create a playlist maker. I have two separate collections:

Playlists

{
  "_id": {
    "$oid": "65e7a3006b0209462d08a5f5"
  },
  "playlist_name": "name of playlist",
  "user_id": {
    "$oid": "65e3544e6b0209462d87f905"
  },
  "time": {
    "$date": "2024-03-05T17:56:00.286Z"
  },
  "tracks": [
    {
      "$oid": "65e7a2bd6b0209462d07d1d4"
    }
  ]
}

Tracks

{
  "_id": {
    "$oid": "65e7a2bd6b0209462d07d1d4"
  },
  "track_id": "500h8jAdr7LvzzXlm1qxtK",
  "album_name": "From The Album Of The Same Name",
  "analysis": {
    "danceability": 0.393,
    "energy": 0.588,
    "key": 9,
    "loudness": -6.68,
    "mode": 0,
    "speechiness": 0.0613,
    "acousticness": 0.345,
    "instrumentalness": 0,
    "liveness": 0.134,
    "valence": 0.728,
    "tempo": 203.145,
    "analysis_url": "https://api.spotify.com/v1/audio-analysis/500h8jAdr7LvzzXlm1qxtK",
    "duration_ms": 186333,
    "time_signature": 4
  },
  "artist_name": "Pilot",
  "spotify": {
    "track_id": "500h8jAdr7LvzzXlm1qxtK",
    "uri": "spotify:track:500h8jAdr7LvzzXlm1qxtK",
    "track_href": "https://api.spotify.com/v1/tracks/500h8jAdr7LvzzXlm1qxtK"
  },
  "track_name": "Magic"
}

I want to have an output like this but for the function to preserve the current order of the tracks array:

{
  "_id": {
    "$oid": "65e7a3006b0209462d08a5f5"
  },
  "playlist_name": "songs-for-events",
  "user_id": {
    "$oid": "65e3544e6b0209462d87f905"
  },
  "time": {
    "$date": "2024-03-05T17:56:00.286Z"
  },
  "tracks": [
    {
      "_id": {
        "$oid": "65e7a2bf6b0209462d07d8a9"
      },
      "track_id": "4wkQmYpAaMe41Rc3sYZ7Vz",
      "album_name": "18 Months",
      "analysis": {
        "danceability": 0.71,
        "energy": 0.882,
        "key": 4,
        "loudness": -2.932,
        "mode": 0,
        "speechiness": 0.0595,
        "acousticness": 0.00777,
        "instrumentalness": 0.00771,
        "liveness": 0.294,
        "valence": 0.875,
        "tempo": 128.016,
        "analysis_url": "https://api.spotify.com/v1/audio-analysis/4wkQmYpAaMe41Rc3sYZ7Vz",
        "duration_ms": 232800,
        "time_signature": 4
      }
  ]
}

Here is my current query:

[
    {
        '$lookup': {
            'from': 'tracks', 
            'localField': 'tracks', 
            'foreignField': '_id', 
            'as': 'tracks'
        }
    }, 
    {
        '$project': {
            'tracks.track_name': 1, 
            'tracks.artist_name': 1, 
            'tracks.album_name': 1, 
            'tracks.track_id': 1, 
            'user_id': 1, 
            'playlist_name': 1
        }
    }
]

I am using MongoClient from PyMongo package for my aggregation queries.

I have tried using $lookup, $addField, $sort and many other aggregation functions to get it to work.


Solution

  • Yes, currently your $lookup stage will override the tracks array field and not preserve the order of the elements in the original tracks array.

    In this case, I suggest having another field for storing the linked/reference tracks array from the tracks collection (linkedTracks). Then you need to iterate the existing tracks array and transform each element to mapped document in the linkedTracks array via $map operator.

    db.playlists.aggregate([
      {
        "$lookup": {
          "from": "tracks",
          "localField": "tracks",
          "foreignField": "_id",
          "as": "linkedTracks"
        }
      },
      {
        $set: {
          tracks: {
            $map: {
              input: "$tracks",
              as: "t",
              in: {
                $first: {
                  $filter: {
                    input: "$linkedTracks",
                    cond: {
                      $eq: [
                        "$$t",
                        "$$this._id"
                      ]
                    }
                  }
                }
              }
            }
          }
        }
      },
      {
        "$project": {
          "tracks.track_name": 1,
          "tracks.artist_name": 1,
          "tracks.album_name": 1,
          "tracks.track_id": 1,
          "user_id": 1,
          "playlist_name": 1
        }
      }
    ])
    

    Demo @ Mongo Playground