I'm trying to update around 100k documents using pymongo 3.12. I believe I'm using the pymongo API correctly, but every time I run a bulk write, it modifies only a fraction of the documents it matches.
from pymongo import UpdateOne

upserts = [UpdateOne({'_id': event["$set"]["metaData"]["id"]}, {'$set': event["$set"]}, upsert=True)
           for event in events_update_data]
result = cursor.bulk_write(upserts, ordered=False)
print(result.bulk_api_result)
Results in:
{'writeErrors': [], 'writeConcernErrors': [], 'nInserted': 0, 'nUpserted': 427, 'nMatched': 116940, 'nModified': 43303, 'nRemoved': 0, 'upserted': removed_by_me}
So why is it matching all the documents I need it to, but modifying less than half of them? I have to run this multiple times for it to update all docs, and the number modified varies greatly between runs. Here is the operation for each document.
data = {"$set": {f'inventoryData.{self.parse_time}': event.pop("inventory"), 'metaData': event}}
I've also tried different variations of upserting. All reproduce the same result.
bulk_operations = cursor.initialize_unordered_bulk_op()
for event in events_update_data:
    bulk_operations.find({'_id': event["$set"]["metaData"]["id"]}).upsert().update({"$set": event["$set"]})
result = bulk_operations.execute()
print(result)
and
upserts = []
for event in events_update_data:
    upserts.append(UpdateOne({'_id': event["$set"]["metaData"]["id"]}, {'$set': event["$set"]}, upsert=True))
    if len(upserts) == 1000:
        try:
            result = cursor.bulk_write(upserts, ordered=False)
            upserts = []
        except BulkWriteError as e:
            print(e.details)
        print(result.bulk_api_result)
If your nMatched is higher than your nModified, then chances are your "updates" are identical to the values already stored in those documents, so they are matched but not counted as modified.
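One way to check this against your own data is to compare each pending $set with the document currently stored. The snippet below is only a rough sketch of that comparison in plain Python, not MongoDB's actual logic: get_path and would_modify are hypothetical helper names, the sample documents are made up, and Python's dict equality ignores key order, whereas the server may treat embedded documents with reordered fields as different.

```python
from typing import Any, Dict

_MISSING = object()  # sentinel for "path not present in the document"


def get_path(doc: Dict[str, Any], dotted: str) -> Any:
    """Resolve a dotted path such as 'inventoryData.t1' in nested dicts."""
    current: Any = doc
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return _MISSING
        current = current[key]
    return current


def would_modify(stored: Dict[str, Any], set_fields: Dict[str, Any]) -> bool:
    """True if applying {'$set': set_fields} would change at least one field."""
    return any(get_path(stored, path) != value for path, value in set_fields.items())


# Made-up example of a document as it might already exist in the collection.
stored = {"_id": 1, "metaData": {"id": 1, "name": "a"}, "inventoryData": {"t1": 5}}

# Same values as stored: matched, but nothing would change.
same = {"metaData": {"id": 1, "name": "a"}, "inventoryData.t1": 5}
# Adds a new inventoryData key: this one would count as modified.
changed = {"metaData": {"id": 1, "name": "a"}, "inventoryData.t2": 7}

print(would_modify(stored, same))     # False
print(would_modify(stored, changed))  # True
```

If you fetch the current documents for a sample of your _id values and run each event["$set"] through a check like this, the number of True results should be in the same ballpark as the nModified you see for that sample.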