php symfony doctrine symfony-2.8

Getting a managed and dirty entity error on persist


I have the following piece of code:

$count = 0;
/** @var Order $order */
foreach ($orders as $order) {
    $output->write(sprintf('Processing order %d (%s)... ', ($count + 1), $order->getReference()));
    if ($this->shouldSkip($order)) {
        $output->writeln('Doesn\'t meet criteria, skipping.');
        continue;
    }
    $paymentPlanName = PayoutPlanFactory::make($order)->getPayoutPlanName();
    $output->write(sprintf('Setting payment plan to %s...', $paymentPlanName));
    $order->setPayoutPlan($paymentPlanName);
    $em->persist($order);
    $output->writeln(' Done.');

    if ((++$count % 1000) === 0) {
        $output->write('Flushing records... ');
        $em->flush();
        $output->writeln('Done.');
        $output->write('Clearing em... ');
        $em->clear();
        $output->writeln('Done.');
    }
}
$em->flush();
$em->clear();

When I run this, the output is exactly what I expect it to be for the first 1000 records. I get the order reference, then the payout plan that it's setting it to. I also look in the DB, and my records are updated there. However, the moment it starts with row 1001, I get the following error:

  [Doctrine\ORM\ORMInvalidArgumentException]                                      
  A managed+dirty entity 57b6fed7b4ad7 can not be scheduled for insertion. 

That id matches an order reference.

I first thought it might be a problem with entity 1001, so changed my batch size to 50, and it then happens on the 51st entity it's trying to save. It's almost like $em->clear(); isn't doing what it's supposed to do.

Is there some way I can work around this problem?


Solution

  • If your orders come from the database through a repository method, the first clear() detaches all of them. That is why nothing works after the first batch of 1000: the remaining orders are no longer managed, and Doctrine gets confused about their state.

    But you're not far from the solution.

    Try something like this at the top of your function:

     $orders = $this->orderRepository->findAll();
     $orderIds = array_map(function (Order $order) {
         return $order->getId();
     }, $orders);

     // From here on, iterate only over an array of ids.
     $this->em->clear(); // detach all orders from the entity manager, freeing memory
     $count = 0;
     foreach ($orderIds as $orderId) {
         // Reload each order so it is managed by the entity manager again.
         $order = $this->orderRepository->find($orderId);
         // your code here
     }
    

    This way you achieve what you are aiming for: keeping performance acceptable over a large number of records by freeing memory.

    Good to know: ->persist($object) is only needed for new entities. If your object comes from a repository method, it is already managed.

    This answer works well for medium to large datasets. Because of the ->find() call, you run one query per record, which can be costly for very large amounts of data. Since lookups by id are very fast in recent database versions, this may still be enough.
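One way to cut the query count down to one per batch is to load the ids in chunks instead of one at a time. The following is only a sketch: processInChunks and its callable parameters are hypothetical names, not part of Doctrine, and the wiring in the trailing comment assumes the entity's identifier field is called id.

```php
<?php

/**
 * Process ids in chunks: one load per chunk instead of one per record.
 *
 * @param array    $ids           ids collected up front
 * @param int      $batchSize     number of records per chunk
 * @param callable $loadBatch     receives one chunk of ids, returns the matching entities
 * @param callable $handle        per-record work
 * @param callable $flushAndClear called once after each chunk
 *
 * @return int number of records handled
 */
function processInChunks(array $ids, int $batchSize, callable $loadBatch, callable $handle, callable $flushAndClear): int
{
    if ($batchSize < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1.');
    }

    $handled = 0;
    foreach (array_chunk($ids, $batchSize) as $chunk) {
        // Entities loaded here are freshly managed, so no persist() is needed.
        foreach ($loadBatch($chunk) as $entity) {
            $handle($entity);
            ++$handled;
        }
        // Write this chunk's changes, then detach everything to free memory.
        $flushAndClear();
    }

    return $handled;
}

// Hypothetical wiring with Doctrine (not executed here):
// processInChunks(
//     $orderIds,
//     1000,
//     function (array $chunk) use ($orderRepository) {
//         return $orderRepository->findBy(['id' => $chunk]); // array value becomes an IN (...) query
//     },
//     function (Order $order) {
//         $order->setPayoutPlan(PayoutPlanFactory::make($order)->getPayoutPlanName());
//     },
//     function () use ($em) { $em->flush(); $em->clear(); }
// );
```

Because each chunk is loaded after the previous clear(), no entity is ever touched while detached.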

    For very large datasets, use Doctrine's batch processing method, which is also simple: https://www.doctrine-project.org/projects/doctrine-orm/en/2.14/reference/batch-processing.html#iterating-results
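The iterating approach from that page can be sketched roughly as follows. updateInBatches is a hypothetical helper, and $em is only assumed to behave like Doctrine's EntityManager (it needs flush() and clear()). The query in the trailing comment uses an assumed entity class name, and toIterable() is, as far as I know, only available from ORM 2.8 on; older versions expose iterate() instead, which yields each row as an array with the entity at index 0.

```php
<?php

/**
 * Apply $update to each entity of an iterable result, flushing and
 * clearing the entity manager every $batchSize entities.
 *
 * @param iterable $entities  e.g. the result of $query->toIterable()
 * @param object   $em        assumed to offer flush() and clear()
 * @param callable $update    per-entity work
 * @param int      $batchSize entities per flush
 *
 * @return int number of entities processed
 */
function updateInBatches(iterable $entities, $em, callable $update, int $batchSize = 1000): int
{
    $count = 0;
    foreach ($entities as $entity) {
        $update($entity); // the entity is already managed: no persist() call
        if ((++$count % $batchSize) === 0) {
            $em->flush(); // write the pending updates
            $em->clear(); // detach everything to free memory
        }
    }

    // Write whatever is left in the final, partial batch.
    $em->flush();
    $em->clear();

    return $count;
}

// Hypothetical usage (entity class name is an assumption):
// $query = $em->createQuery('SELECT o FROM AppBundle\Entity\Order o');
// updateInBatches($query->toIterable(), $em, function (Order $order) {
//     $order->setPayoutPlan(PayoutPlanFactory::make($order)->getPayoutPlanName());
// });
```

Since rows are hydrated one at a time while iterating, clearing the entity manager mid-loop does not detach orders that have yet to be processed, which is what went wrong in the original loop.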