I'm trying to understand how and why it's so hard to load my referenced data, in unmongo/pymongo
@instance.register
class MyEntity(Document):
account = fields.ReferenceField('Account', required=True)
date = fields.DateTimeField(
default=lambda: datetime.utcnow(),
allow_none=False
)
positions = fields.ListField(fields.ReferenceField('Position'))
targets = fields.ListField(fields.ReferenceField('Target'))
class Meta:
collection = db.myentity
when i retrieve this with:
def find_all(self):
items = self._repo.find_all(
{
'user_id': self._user_id
}
)
return items
and then dump it like so:
from bson.json_util import dumps
all_items = []
for item in all_items:
all_items.append(item.dump())
return dumps(all_items)
i get the following JSON object:
[
{
"account": "5e990db75f22b6b45d3ce814",
"positions": [
"5e9a594373e07613b358bdbb",
"5e9a594373e07613b358bdbe",
"5e9a594373e07613b358bdc1"
],
"date": "2020-04-18T01:34:59.919000+00:00",
"id": "5e9a594373e07613b358bdcb",
"targets": [
"5e9a594373e07613b358bdc4",
"5e9a594373e07613b358bdc7",
"5e9a594373e07613b358bdca"
]
}
]
and without dump
<object Document models.myentity.schema.MyEntity({
'targets':
<object umongo.data_objects.List([
<object umongo.frameworks.pymongo.PyMongoReference(
document=Target,
pk=ObjectId('5e9a594373e07613b358bdc4')
)>,
<object umongo.frameworks.pymongo.PyMongoReference(
document=Target,
pk=ObjectId('5e9a594373e07613b358bdc7')
)>,
<object umongo.frameworks.pymongo.PyMongoReference(
document=Target,
pk=ObjectId('5e9a594373e07613b358bdca'))>]
)>,
'id': ObjectId('5e9a594373e07613b358bdcb'),
'positions':
<object umongo.data_objects.List([
<object umongo.frameworks.pymongo.PyMongoReference(
document=Position,
pk=ObjectId('5e9a594373e07613b358bdbb')
)>,
<object umongo.frameworks.pymongo.PyMongoReference(
document=Position,
pk=ObjectId('5e9a594373e07613b358bdbe'))>,
<object umongo.frameworks.pymongo.PyMongoReference(
document=Position,
pk=ObjectId('5e9a594373e07613b358bdc1'))>])>,
'date': datetime.datetime(2020, 4, 18, 1, 34, 59, 919000),
'account': <object umongo.frameworks.pymongo.PyMongoReference(document=Account, pk=ObjectId('5e990db75f22b6b45d3ce814'))>
})>
i.e. what if there's a reference field in 'target' as well? I understand this can be expensive on the DB, but is there some way to specify this on the schema definition itself? i.e. in meta class, that i always want the full, dereferenced object for a particular field?
I'm not that new to mongo, but new to python / mongo. I feel like I'm missing something fundamental here.
EDIT: so right after posting, i did find this issue:
https://github.com/Scille/umongo/issues/42
which provides a way forward
is this still the best approach? Still trying to understand why this is treated like an edge case.
EDIT 2: progress
class MyEntity(Document):
account = fields.ReferenceField('Account', required=True, dump=lambda: 'fetch_account')
date = fields.DateTimeField(
default=lambda: datetime.utcnow(),
allow_none=False
)
#trade = fields.DictField()
positions = fields.ListField(fields.ReferenceField('Position'))
targets = fields.ListField(fields.ReferenceField('Target'))
class Meta:
collection = db.trade
@property
def fetch_account(self):
return self.account.fetch()
so with the newly defined property decorator, i can do:
items = MyEntityService().find_all()
allItems = []
for item in allItems:
account = item.fetch_account
log(account.dump())
allItems.append(item.dump())
When I dump account, all is good. But I don't want to explicitly/manually have to do this. It still means I have to recursively unpack and then repack each referenced doc, and any child references, each time I make a query. It also means the schema SOT is no longer contained just in the umongo class, i.e. if a field changes, I'll have to refactor every query that uses that field.
I'm still looking for a way to decorate/flag this on the schema itself. e.g.
account = fields.ReferenceField('Account', required=True, dump=lambda: 'fetch_account')
dump=lambda: 'fetch_account'
i just made up, it doesn't do anything, but that's more or less the pattern i'm going for, not sure if this is possible (or even smart: other direction, pointers to why i'm totally wrong in my approach are welcome) ....
EDIT 3: so this is where i've landed:
@property
def fetch_account(self):
return self.account.fetch().dump()
@property
def fetch_targets(self):
targets_list = []
for target in self.targets:
doc = target.fetch().dump()
targets_list.append(doc)
return targets_list
@property
def fetch_positions(self):
positions_list = []
for position in self.positions:
doc = position.fetch().dump()
positions_list.append(doc)
return positions_list
and then to access:
allItems = []
for item in items:
account = item.fetch_account
positions = item.fetch_positions
targets = item.fetch_targets
item = item.dump()
item['account'] = account
item['positions'] = positions
item['targets'] = targets
# del item['targets']
allTrades.append(item)
I could clean it up/abstract it some, but i don't see how i could really reduce the general verbosity at at this point. It does seem to be give me the result i'm looking for though:
[
{
"date": "2020-04-18T01:34:59.919000+00:00",
"targets": [
{
"con_id": 331641614,
"value": 106,
"date": "2020-04-18T01:34:59.834000+00:00",
"account": "5e990db75f22b6b45d3ce814",
"id": "5e9a594373e07613b358bdc4"
},
{
"con_id": 303019419,
"value": 0,
"date": "2020-04-18T01:34:59.867000+00:00",
"account": "5e990db75f22b6b45d3ce814",
"id": "5e9a594373e07613b358bdc7"
},
{
"con_id": 15547841,
"value": 9,
"date": "2020-04-18T01:34:59.912000+00:00",
"account": "5e990db75f22b6b45d3ce814",
"id": "5e9a594373e07613b358bdca"
}
],
"account": {
"user_name": "hello",
"account_type": "LIVE",
"id": "5e990db75f22b6b45d3ce814",
"user_id": "U3621607"
},
"positions": [
{
"con_id": 331641614,
"value": 104,
"date": "2020-04-18T01:34:59.728000+00:00",
"account": "5e990db75f22b6b45d3ce814",
"id": "5e9a594373e07613b358bdbb"
},
{
"con_id": 303019419,
"value": 0,
"date": "2020-04-18T01:34:59.764000+00:00",
"account": "5e990db75f22b6b45d3ce814",
"id": "5e9a594373e07613b358bdbe"
},
{
"con_id": 15547841,
"value": 8,
"date": "2020-04-18T01:34:59.797000+00:00",
"account": "5e990db75f22b6b45d3ce814",
"id": "5e9a594373e07613b358bdc1"
}
],
"id": "5e9a594373e07613b35
8bdcb"
}
]
It seems like this is a design choice in umongo.
In Mongoid for example (the Ruby ODM for MongoDB), when an object is referenced it is fetched from the database automatically through associations as needed.
As an aside, in an ODM the features of "define a field structure" and "seamlessly access data through application objects" are quite separate. For example, my experience with Hibernate in Java suggests it is similar to what you are discovering with umongo - once the data is loaded, it provides a way of accessing the data using application-defined field structure with types etc., but it doesn't really help with loading the data from application domain transparently.