pythonmongodbaggregation-frameworkpymongopipeline

PyMongo - Group by year based on subdocument date


I have a MongoDB document like:

[
  {
    _id: ObjectId('67cfd69ba3e561d35ee57f51'),
    created_at: ISODate('2025-03-11T06:22:19.044Z'),
    conversation: [
      {
        id: '67cfd6c1a3e561d35ee57f53',
        feedback: {
          liked: false,
          disliked: true,
          copied: true,
          created_at: ISODate('2025-03-11T06:27:48.634Z')
        }
      },
      {
        id: '67cfd77fa3e561d35ee57f54',
        feedback: {
          liked: true,
          disliked: false,
          copied: false,
          created_at: ISODate('2025-03-11T06:28:25.099Z')
        }
      },
      { id: '67d009f1a3e561d35ee57f5a', feedback: null },
      { id: '67d009f8a3e561d35ee57f5b', feedback: null }
    ]
  },
  {
    _id: ObjectId('67d00aeaa3e561d35ee57f5d'),
    created_at: ISODate('2025-03-11T10:05:30.848Z'),
    conversation: [
      { id: '67d00af7a3e561d35ee57f5f', feedback: null },
      { id: '67d00afaa3e561d35ee57f60', feedback: null }
    ]
  }
]

Where the main document has a conversation subdocument, I want to know how many likes, dislikes and copied data in each year.

I tried to get year from the conversation.feedback.created_at using $dateToString operator.

pipeline = [
  { 
    '$match': { 'conversation.feedback.copied': True }
  },
  { 
    '$group': {
      '_id': { 
        '$dateToString': {
          'format': '%Y',
          'date': '$conversation.feedback.created_at'
        }
      },
      'total_copied': { '$sum': 1 }
    }
  }
]

But it gives an error:

OperationFailure: PlanExecutor error during aggregation :: caused by :: can't convert from BSON type array to Date, full error: {'ok': 0.0, 'errmsg': "PlanExecutor error during aggregation :: caused by :: can't convert from BSON type array to Date", 'code': 16006, 'codeName': 'Location16006'}

What I am expecting out as:

{
    "2025": {
        "total_liked": 1,
        "total_disliked": 1,
        "total_copied": 1
    }
}

How to convert the DateTime object to year and combine the total counts for 3 parameters?


Solution

    1. You need the $unwind stage to deconstruct the conversation array before grouping by conversation.feedback.created_at.

    2. Note that, in your sample data, there is possibly the conversation.feedback is null. Hence you should remove those unwinded document with conversation.feedback is null.

    3. For calculating the sum based on the boolean value, you can work with $cond to add 1 when the value is true.

    4. If you are looking for the generated output with key-value pair, you may look for $replaceRoot and $arrayToObject to convert list of objects to key-value pair.

    db.collection.aggregate([
      {
        "$match": {
          "conversation.feedback.copied": true
        }
      },
      {
        "$unwind": "$conversation"
      },
      {
        "$match": {
          "conversation.feedback": {
            "$ne": null
          }
        }
      },
      {
        "$group": {
          "_id": {
            "$dateToString": {
              "format": "%Y",
              "date": "$conversation.feedback.created_at"
            }
          },
          "total_copied": {
            "$sum": {
              $cond: [
                {
                  $eq: [
                    "$conversation.feedback.copied",
                    true
                  ]
                },
                1,
                0
              ]
            }
          },
          "total_liked": {
            "$sum": {
              "$cond": [
                {
                  "$eq": [
                    "$conversation.feedback.liked",
                    true
                  ]
                },
                1,
                0
              ]
            }
          },
          "total_disliked": {
            "$sum": {
              "$cond": [
                {
                  "$eq": [
                    "$conversation.feedback.disliked",
                    true
                  ]
                },
                1,
                0
              ]
            }
          }
        }
      },
      {
        "$replaceRoot": {
          "newRoot": {
            "$arrayToObject": [
              [
                {
                  "k": "$_id",
                  "v": {
                    "total_copied": "$total_copied",
                    "total_liked": "$total_liked",
                    "total_disliked": "$total_disliked"
                  }
                }
              ]
            ]
          }
        }
      }
    ])
    

    Demo @ Mongo Playground