I have a Mongo aggregate query that projects some fields and calculates two other ones using $sum. The query works as expected, so I created a unit test for it, and to my surprise the test was failing.
I created a minimal, complete, and verifiable example to test my hypothesis that this was a problem with MongoMock, and it seems to be!
Here is the code:
import mongoengine as mongo
from mongoengine import connect
from mongoengine.queryset import QuerySet


class ValuesList(mongo.EmbeddedDocument):
    updated_value = mongo.DecimalField()


class ValuesHistory(mongo.Document):
    name = mongo.StringField()
    base_value = mongo.DecimalField()
    values_list = mongo.EmbeddedDocumentListField(ValuesList, required=False)
    meta = {
        'collection': 'values_history'
    }

    def __str__(self):
        return 'name: {}\nbase_value: {}\n'.format(self.name, self.base_value)


def migrate_data(new_collection):
    ValuesHistory.objects.aggregate(
        {'$project': {'name': 1,
                      'base_value': {'$sum': ['$base_value', {'$arrayElemAt': ['$values_list.updated_value', -1]}]}
                      }
         },
        {'$out': "{}".format(new_collection)}
    )


def clear_tables_and_insert_test_data(db):
    db.test.values_history.drop()
    db.test.updated_values.drop()
    ValuesHistory(name='first',
                  base_value=100,
                  values_list=[ValuesList(updated_value=5),
                               ValuesList(updated_value=15)]).save()


def run_aggregate_query_with_db(db):
    new_collection_name = 'updated_values'
    migrate_data(new_collection_name)
    new_group = ValuesHistory.switch_collection(ValuesHistory(), new_collection_name)
    aggregated_values = QuerySet(ValuesHistory, new_group._get_collection()).all()
    for value in aggregated_values:
        print(value)
    db.close()
A quick explanation of the code above.
ValuesHistory is a class that contains a string name, a numeric base_value and a list of values (instances of the ValuesList class).
The function clear_tables_and_insert_test_data drops the two collections used in this test and inserts some test data.
The query in the migrate_data function creates a new collection (through the $out operator). The base_value of each document in the new collection should be the sum of the current base_value and the last value in values_list. In my case it should be 115 (100 is the current value and 15 is the last value in the list).
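To make the expected result concrete, here is a minimal plain-Python sketch of what the $project stage should compute for one document. The function name expected_base_value and the dictionary layout are only illustrative assumptions, not part of MongoEngine or MongoMock, and treating an empty list as contributing nothing is my reading of how $sum handles a missing operand.

```python
def expected_base_value(document):
    """Mimic the projection: base_value plus the last updated_value, if any."""
    # '$values_list.updated_value' yields the list of updated_value fields.
    updated_values = [item["updated_value"] for item in document.get("values_list", [])]
    # {'$arrayElemAt': [..., -1]} picks the last element; assumed to add nothing when empty.
    last_value = updated_values[-1] if updated_values else 0
    return document["base_value"] + last_value


# Same data as inserted by clear_tables_and_insert_test_data.
sample = {
    "name": "first",
    "base_value": 100,
    "values_list": [{"updated_value": 5}, {"updated_value": 15}],
}
print(expected_base_value(sample))  # 115
```

This is the value the unit test compares against, regardless of which backend runs the pipeline.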
If I run the code using a connection to my local MongoDB, like this:
if __name__ == '__main__':
    db = connect('test')  # connect to real instance of Mongo
    clear_tables_and_insert_test_data(db)
    run_aggregate_query_with_db(db)
I get 115 as a result, which is exactly what is expected.
If I, instead, use a connection to MongoMock:
if __name__ == '__main__':
    db = connect('test', host='mongomock://localhost')  # Connect to MongoMock instance
    clear_tables_and_insert_test_data(db)
    run_aggregate_query_with_db(db)
I get 100 as the result, which is odd! It looks like the $sum operator did not do its job properly, since the sum of 100 and 15 resulted in 100!
EDIT: I also tried using the $add operator, but the problem remains the same, yielding 100 when it should be 115.
TL;DR
Question: How should I use $sum (or $add) inside an aggregate pipeline on MongoMock so that it yields the correct value?
The problem described was indeed a bug in Mongomock that existed up to version 3.14.0.
After posting the original question, I opened an issue on Mongomock's GitHub repository describing the problem. It was fixed shortly afterwards, and version 3.15.0 was released a few days ago. I reran the code from the question, and the issue is now solved for both the $add and $sum operators!
TL;DR
Updating to Mongomock 3.15.0 is enough to solve the problem.
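If you want a test suite to fail loudly on an older install, a rough guard along these lines could help. It is only a sketch: the helper names are hypothetical, it reads the installed version through the standard library's importlib.metadata (Python 3.8+), and it assumes a plain numeric "X.Y.Z" version string.

```python
from importlib import metadata

FIXED_IN = (3, 15, 0)  # first Mongomock release with the fix, per the answer above


def parse_version(text):
    """Turn 'X.Y.Z' into a tuple of ints, padding missing parts with zeros."""
    numbers = [int(part) for part in text.split(".")[:3]]
    numbers += [0] * (3 - len(numbers))
    return tuple(numbers)


def has_sum_fix(version_text):
    """Return True if this version is at least 3.15.0."""
    return parse_version(version_text) >= FIXED_IN


def installed_mongomock_version():
    """Return the installed mongomock version string, or None if not installed."""
    try:
        return metadata.version("mongomock")
    except metadata.PackageNotFoundError:
        return None
```

A test module could call installed_mongomock_version() once and skip or fail the aggregate tests when has_sum_fix returns False for the result.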