phpmongodbperformanceyii2batch-updates

How to add a new attribute to thousands of Mongodb documents in yii2 performant?


I have php + yii2 + mongodb running on linux in virtualbox (10 cores, 12 GB RAM). yii2-mongodb is installed. I have a csv file that contains 100.000+ records in 2 columns: custom identifier (not _id), and a number. I would like to add the number to the documents in the collection in mongodb where the custom identifier matches the document. I was trying several different approaches, like 1. $collection->update() in the while(fgetcsv) loop, 2. tried to create an array in while() like:

$command = $conn->createCommand();

while(fgetcsv...) {
    $command->addUpdate(['identifier' => $csvrow[0]], ['number' => $csvrow[1]]);
}

$command->executeBatch('collection');

it works but it is terribly slow. 10-20 records per second. I have tried to add a text index to the custom identifier in mongodb compass (I have 1(asc), -1(desc), 2dsphere and text), got no performance boost there. What am I missing?


Solution

  • Looks like you did not create an index or you created a wrong index. Use this one

    db.collection.createIndexes([ { identifier: 1 } ], { unique: true })
    

    By using bulkWrite it should get even faster. In order to avoid "out of memory" problems I would suggest

    $ops = [];
    while(fgetcsv...) {
        array_push($ops, [ 'updateOne' => [ [ 'identifier' => $csvrow[0] ], [ '$set' => [ 'number' => $csvrow[1] ] ] ] ]);
        if (count($ops)) > 10000 {
            $conn->createCommand()->addUpdate($ops)->executeBatch('collection');
            $ops = [];
        }
    }
    if (count($ops)) > 0 
       $conn->createCommand()->addUpdate($ops)->executeBatch('collection');