python python-3.x pymongo pymongo-3.x

Pymongo not modifying all matching documents


I'm trying to update around 100k documents using PyMongo 3.12. I believe I'm using the PyMongo API correctly, but every time I run a bulk write, it only modifies roughly half of the documents that it matches.

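from pymongo import UpdateOne

# bulk_write is a Collection method, so 'cursor' is a collection here despite its name
upserts = [UpdateOne({'_id': event["$set"]["metaData"]["id"]}, {'$set': event["$set"]}, upsert=True)
           for event in events_update_data]
result = cursor.bulk_write(upserts, ordered=False)
print(result.bulk_api_result)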

Results in:

{'writeErrors': [], 'writeConcernErrors': [], 'nInserted': 0, 'nUpserted': 427, 'nMatched': 116940, 'nModified': 43303, 'nRemoved': 0, 'upserted': removed_by_me}

So why does it match all the documents I need it to, but modify fewer than half of them? I have to run this multiple times before every document is updated, and the number modified varies greatly between runs. Here is the update built for each document:

data = {"$set": {f'inventoryData.{self.parse_time}': event.pop("inventory"), 'metaData': event}}

I've also tried different variations of upserting. All reproduce the same result.

bulk_operations = cursor.initialize_unordered_bulk_op()
for event in events_update_data:
    bulk_operations.find({'_id': event["$set"]["metaData"]["id"]}).upsert().update({"$set": event["$set"]})
# a bulk operation can only be executed once, so run it after the loop
result = bulk_operations.execute()
print(result)

and

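from pymongo.errors import BulkWriteError

upserts = []
for event in events_update_data:
    upserts.append(UpdateOne({'_id': event["$set"]["metaData"]["id"]}, {'$set': event["$set"]}, upsert=True))
    if len(upserts) == 1000:
        try:
            result = cursor.bulk_write(upserts, ordered=False)
            print(result.bulk_api_result)
        except BulkWriteError as e:
            print(e.details)
        upserts = []  # start a fresh batch whether or not the write succeeded
if upserts:  # send the final batch of fewer than 1000 operations
    result = cursor.bulk_write(upserts, ordered=False)
    print(result.bulk_api_result)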

Solution

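  • If your nMatched is higher than your nModified, then chances are many of your "updates" contain the same values the documents already hold. nMatched counts every document the filter found, while nModified counts only the documents whose stored contents actually changed, so a $set that writes identical values is matched but not modified. In the result above, 116940 documents matched and 43303 were modified, which leaves 73637 that were matched but left as they were.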
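
One way to check this explanation before sending a batch is to compare each $set payload with the document it targets. The following is a minimal sketch in plain Python, not part of PyMongo: the helper names (get_path, is_noop_set, classify_updates) and the sample data are hypothetical, and it assumes the current documents have already been loaded into a dict keyed by _id (for example from a find() over the relevant ids). Python equality only approximates the server's comparison (it ignores key order in embedded documents and does not handle array indexes in dotted paths), so treat the counts as an estimate.

```python
def get_path(doc, dotted_key):
    """Return (found, value) for a dotted field path such as 'a.b.c'."""
    current = doc
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return False, None
        current = current[part]
    return True, current


def is_noop_set(doc, set_fields):
    """True if every field in the $set payload already has that value in doc."""
    for key, new_value in set_fields.items():
        found, old_value = get_path(doc, key)
        if not found or old_value != new_value:
            return False
    return True


def classify_updates(existing_by_id, updates):
    """Roughly predict how (_id, set_fields) pairs would be counted by an upsert."""
    counts = {"upserted": 0, "modified": 0, "matched_unchanged": 0}
    for _id, set_fields in updates:
        doc = existing_by_id.get(_id)
        if doc is None:
            counts["upserted"] += 1           # no match: would be inserted
        elif is_noop_set(doc, set_fields):
            counts["matched_unchanged"] += 1  # matched, but nothing to change
        else:
            counts["modified"] += 1           # matched and actually changed
    return counts


# Hypothetical sample data shaped like the documents in the question.
existing = {
    1: {"_id": 1, "metaData": {"id": 1, "name": "a"}, "inventoryData": {"t1": 5}},
    2: {"_id": 2, "metaData": {"id": 2, "name": "b"}},
}
updates = [
    (1, {"inventoryData.t1": 5, "metaData": {"id": 1, "name": "a"}}),  # identical values
    (2, {"inventoryData.t1": 7, "metaData": {"id": 2, "name": "b"}}),  # adds a new field
    (3, {"metaData": {"id": 3, "name": "c"}}),                         # unknown _id
]
counts = classify_updates(existing, updates)
print(counts)
```

If matched_unchanged comes out close to nMatched minus nModified for a real batch, the unmodified documents were most likely already up to date.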