
MongoEngine: storing EmbeddedDocument in DictField


I'm modelling a MongoDB database in MongoEngine for a web project. I want to store the data in a slightly unusual way to be able to efficiently query it later.

Our data in MongoDB looks something like this:

// "outer"
{  
  "outer_data": "directors",
  "embed": {
     "some_md5_key": { "name": "P.T. Anderson" },
     "another_md5_key": { "name": "T. Malick" },
     ...
   }
}

My first instinct was to model it like this in MongoEngine:

class Inner(EmbeddedDocument):
  name = StringField()

class Outer(Document):
  outer_data = StringField()
  embed = DictField(EmbeddedDocument(Inner))  # this isn't allowed but you get the point

In other words, what I essentially want is the same as storing an EmbeddedDocument in a ListField, but in a DictField with a dynamic key for each EmbeddedDocument.

Example that is allowed with a ListField for reference:

class Inner(EmbeddedDocument):
  inner_id = StringField(unique=True)  # this replaces the dict keys
  name = StringField()

class Outer(Document):
  outer_data = StringField()
  embed = ListField(EmbeddedDocumentField(Inner))

I would like the nested "Inner" documents to be returned as MongoEngine objects too, while still using a DictField with an EmbeddedDocument as each dict value. How can I model this in MongoEngine? Is it even possible, or do I have to naively place all the data under a generic DictField?


Solution

  • I finally found the answer to my problem. The correct way to achieve this pattern is by making use of a MapField.

    The corresponding model in MongoEngine looks like:

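    from mongoengine import Document, EmbeddedDocument, EmbeddedDocumentField, MapField, StringField

    class Inner(EmbeddedDocument):
      name = StringField()
    
    class Outer(Document):
      outer_data = StringField()
      embed = MapField(EmbeddedDocumentField(Inner))  # string key -> Inner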
    

    In MongoDB, all keys need to be strings, so there is no need to specify a "field type" for the keys in the MapField.