jsonmongodbaggregation-frameworkcalculated-field

Split array-like string values in MongoDB aggregation


A field in my data is stored as a string instead of a real nested array:

{
  "uuid": "yxcvb", 
  "product": "[{\"pid\":\"4f76c06825aa486db9a7a5cedbc2ea19\",\"size\":\"S\"},{\"pid\":\"d3055e0a1bb040dba65674b9733bf4ce\",\"size\":\"M\"}]",
  "timestamp":{"$date":"2023-01-01T11:25:47+00:00"}
},
{
  "uuid": "asdfg", 
  "product": "[{\"pid\":\"369b33cc768d496193b18d657d706920\",\"size\":\"M\"}]",
  "timestamp":{"$date":"2023-01-02T11:25:47+00:00"}
},
{
  "uuid": "qwert", 
  "product": "[{\"pid\":\"13325cb71fd7413c8f3e18caf20b5d6d\",\"size\":\"S\"},{\"pid\":\"ac431397d49d449cab44d98dc13ec57c\",\"size\":\"M\"},{\"pid\":\"ba35b9f804a44e15bf197d3ef671dc34\",\"size\":\"L\"}]",
  "timestamp":{"$date":"2023-01-03T11:25:47+00:00"}
}

/// "product" has a maximum of 4 dict-like pid-size units

I would like to know the count of each size and group by timestamp. Expected output:

[
  {
    "week": 52,
    "sizeS": 1
  },
  {
    "week": 52,
    "sizeM": 1
  },
  {
    "week": 1,
    "sizeS": 1
  },
  {
    "week": 1,
    "sizeM": 2
  },
  {
    "week": 1,
    "sizeL": 1
  }
]

To realize this, the following problems need to be solved:

  1. Turn the array-like string into a real string, then I could use $unwind.
  2. If 1 cannot be achieved, extract all pid and size values in a document, and save only the value after the colon (e.g. "4f76c06825aa486db9a7a5cedbc2ea19", "M").
  3. After 2, count total records of each size within the grouped timeframe (week).

Solution

  • You can use $function to parse the json strings and use it as normal mongodb fields.

    db.collection.aggregate([
      {
        "$set": {
          "product": {
            "$function": {
              "body": "function(product) {return JSON.parse(product)}",
              "args": [
                "$product"
              ],
              "lang": "js"
            }
          }
        }
      },
      {
        "$unwind": "$product"
      },
      {
        $group: {
          _id: {
            week: {
              "$isoWeek": "$timestamp"
            },
            size: "$product.size"
          },
          count: {
            $sum: 1
          }
        }
      },
      {
        $sort: {
          "_id.week": -1
        }
      }
    ])
    

    Mongo Playground