python mongodb pymongo-3.x

pymongo create a dict based on a group match


I'm new to pymongo and can't quite wrap my head around the logic of creating a nested group and match.

I'm trying to find all the unique model types and then create a list of the logical names used for each model.

The output that I want to create, or something close to it:

    {
       {"Model-1", {Devices: ["Name-1","Name-4"]},
       {"Model-2", {Devices: ["Name-2","Name-3"]}
    }

Mongo DB Data:

[
  {
    "_id": "1",
    "cdate": {
      "$date": "2023-11-16T00:00:00.000Z"
    },
    "AP Name": "Name-1",
    "Model": "Model-1"
  },
  {
    "_id": "2",
    "cdate": {
      "$date": "2023-11-16T00:00:00.000Z"
    },
    "AP Name": "Name-2",
    "Model": "Model-2"
  },
  {
    "_id": "3",
    "cdate": {
      "$date": "2023-11-16T00:00:00.000Z"
    },
    "AP Name": "Name-3",
    "Model": "Model-2"
  },
  {
    "_id": "4",
    "cdate": {
      "$date": "2023-11-16T00:00:00.000Z"
    },
    "AP Name": "Name-4",
    "Model": "Model-1"
  }
]

My code is below. I don't think this is the best way to do this, so any help or suggestions would be great.

def mongo_aggregate_tags(foo_coll, foo_keyword, foo_match, x):
    agg_sites = foo_coll.aggregate(
        [
            {"$match": {f"{foo_match}": x}},
            {"$group": {"_id": f"${foo_keyword}"}},
            {"$sort": {foo_keyword: 1}},
        ]
    )
    return agg_sites


for f_item in coll_inv.distinct("Model"):
    for x in mongo_aggregate_tags(coll_inv, "Model", "AP Name", f_item):
        print(x, " ", f_item)
        print(type(x))

The output I actually get looks like this:

{'_id': 'Model-1'}   Name-1
{'_id': 'Model-1'}   Name-4
{'_id': 'Model-2'}   Name-2
{'_id': 'Model-2'}   Name-3

Solution

  • The output you want, {"Model-1", {Devices: ["Name-1","Name-4"]}, is not a dict. In Python, braces around comma-separated values with no key: value pairs make a set, and sets can't contain dicts because dicts are unhashable, so that would be invalid. The outer {} braces would also make it a set within a set, but sets can't contain other sets either.

    But this would be valid, a list of dicts:

    [
        {"Model": "Model-1", "Devices": ["Name-1","Name-4"]},
        {"Model": "Model-2", "Devices": ["Name-2","Name-3"]}
    ]
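
To make the set-versus-dict point concrete, here is a minimal sketch in plain Python (no MongoDB needed). The variable names `error_message` and `as_list` are only illustrative.

```python
# Building a set that contains a dict fails, because dicts are unhashable.
error_message = None
try:
    invalid = {"Model-1", {"Devices": ["Name-1", "Name-4"]}}
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# A list of dicts with explicit keys is valid Python (and valid JSON).
as_list = [
    {"Model": "Model-1", "Devices": ["Name-1", "Name-4"]},
    {"Model": "Model-2", "Devices": ["Name-2", "Name-3"]},
]
print(as_list)
```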
    

    In your aggregation pipeline, you have a $group stage but it's missing an accumulator, so nothing is collected from the documents within each group and you only get the distinct _id values back. (You can have more than one accumulator in a group.)

    This aggregation pipeline provides the output you want. I have removed the match stage because it's unclear what you need it for.
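
As a rough mental model, grouping on Model with a $push accumulator on AP Name behaves like the plain-Python loop below. This is only a sketch of the idea, not how MongoDB implements it; `docs` mirrors the sample data from the question with the cdate field left out for brevity.

```python
from collections import defaultdict

# Sample documents from the question (cdate omitted for brevity).
docs = [
    {"_id": "1", "AP Name": "Name-1", "Model": "Model-1"},
    {"_id": "2", "AP Name": "Name-2", "Model": "Model-2"},
    {"_id": "3", "AP Name": "Name-3", "Model": "Model-2"},
    {"_id": "4", "AP Name": "Name-4", "Model": "Model-1"},
]

# Group key is the Model value; the "accumulator" appends each AP Name
# to the list for its group, which is what $push does per group.
groups = defaultdict(list)
for doc in docs:
    groups[doc["Model"]].append(doc["AP Name"])

# Reshape and sort by the group key, like the $project and $sort stages.
result = [
    {"Model": model, "Devices": names}
    for model, names in sorted(groups.items())
]
print(result)
```

Without the accumulator line inside the loop, all that would remain is the set of group keys, which matches the output shown in the question.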

    db.collection.aggregate([
      {
        "$group": {
          "_id": "$Model",
          "Devices": { "$push": "$AP Name" }
        }
      },
      {
        "$project": {
          "_id": 0,
          "Model": "$_id",
          "Devices": 1
        }
      },
      { "$sort": { "Model": 1 } }
    ])
    

    Mongo Playground: the result matches the list of dicts shown above.


    Your Python function would then be:

    
    def mongo_aggregate_tags(foo_coll, grouping, array_field):
        agg_sites = foo_coll.aggregate(
            [
                {
                    "$group": {
                        "_id": f"${grouping}",
                        "Devices": {"$push": f"${array_field}"},
                    }
                },
                {
                    "$project": {
                        "_id": 0,
                        grouping: "$_id",
                        "Devices": 1,
                    }
                },
                {"$sort": {grouping: 1}},
            ]
        )
        return agg_sites
    
    
    for x in mongo_aggregate_tags(coll_inv, "Model", "AP Name"):
        print(x)
    

    (I skipped the x and f_item params since their purpose is also unclear.)

    To take the example further, I've added a "Colour" field, so it can be used like this:

    print("Models by Colour:")
    for x in mongo_aggregate_tags(coll_inv, "Model", "Colour"):
        print(x)
    
    # or group by Colour
    print("Colours by Models:")
    for x in mongo_aggregate_tags(coll_inv, "Colour", "Model"):
        print(x)
    

    Output:

    Models by Colour:
    {'Devices': ['red', 'green'], 'Model': 'Model-1'}
    {'Devices': ['green', 'blue'], 'Model': 'Model-2'}
    Colours by Models:
    {'Devices': ['Model-2'], 'Colour': 'blue'}
    {'Devices': ['Model-2', 'Model-1'], 'Colour': 'green'}
    {'Devices': ['Model-1'], 'Colour': 'red'}
    

    (You can change the label for the accumulated array field and pass it in as a parameter to the function.)
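
One possible way to do that is sketched below. The `label` parameter and the separate `build_group_pipeline` helper are assumptions added for illustration; splitting out the pipeline builder just makes it easy to inspect the stages without a database connection.

```python
def build_group_pipeline(grouping, array_field, label="Devices"):
    """Return the pipeline stages; label names the accumulated array field."""
    return [
        {
            "$group": {
                "_id": f"${grouping}",
                label: {"$push": f"${array_field}"},
            }
        },
        {"$project": {"_id": 0, grouping: "$_id", label: 1}},
        {"$sort": {grouping: 1}},
    ]


def mongo_aggregate_tags(foo_coll, grouping, array_field, label="Devices"):
    # foo_coll is expected to be a pymongo Collection.
    return foo_coll.aggregate(build_group_pipeline(grouping, array_field, label))


# Inspect the stages without touching a database.
pipeline = build_group_pipeline("Colour", "Model", label="Models")
print(pipeline)
```

With this, `mongo_aggregate_tags(coll_inv, "Colour", "Model", label="Models")` would return rows whose array field is named Models instead of Devices. Note that `label` should differ from `grouping`, otherwise the two keys collide in the $project stage.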