Tags: symfony, doctrine, command, query-builder, memory-limit

Is there a faster way to hydrate my table?


I'm currently migrating data using a Symfony command.

Basically, the goal is to select all the identities in my application (around 100,000 entries) and, for each one, take the identity along with its name in order to hydrate my 'identitie_name' table, which has an 'identitie' field (a relation) and a 'name' field (copied from the Identitie entity).

My logic works, but it's extremely slow (I run the command with php -d memory_limit=-1 bin/console my-command).

Here is the logic in my Repository and in my Command. Do you have a tip to speed up this process?

Command.php

protected function execute(InputInterface $input, OutputInterface $output): int
{
    $this->io = new SymfonyStyle($input, $output);
    $identities = $this->identitieRepository->getAllIdentities();
    $output->writeln('progressing...');

    foreach ($identities as $identitie) {
        $this->insertIntoIdentitieName($identitie, $identitie->getName());
        $this->entityManager->flush();
    }

    $this->io->success('good!');

    return 0;
}

private function insertIntoIdentitieName($identitie, $name)
{
    $identitieName = new IdentiteName();
    $identitieName->setIdentite($identitie);
    $identitieName->setName($name);
    $identitieName->setActive(true);

    $this->entityManager->persist($identitieName);
}

Repository.php

public function getAllIdentities()
{
    $query = $this->getEntityManager()->createQueryBuilder()
        ->select('i')
        ->from('App\Entity\Identitie', 'i')
        ->orderBy('i.id', 'DESC');

    return $query->getQuery()->getResult();
}

Solution

  • Your current approach to migrating the data is functional but inefficient because of the $entityManager->flush() call inside the loop. Each flush() makes Doctrine ORM execute a separate database transaction per identity, which is slow. To optimize this, you can either move flush() outside the loop or use batch processing.

    Simplest approach

    foreach($identities as $identitie) {
        $this->insertIntoIdentitieName($identitie, $identitie->getName());
    }
    
    $this->entityManager->flush();
    

    Batch processing example (note that $i is incremented before the modulo check, so the first flush happens after a full batch of 20 rather than on the very first entity):

    $i = 0;
    $batchSize = 20;

    foreach ($identities as $identitie) {
        $this->insertIntoIdentitieName($identitie, $identitie->getName());
        $i++;

        if (($i % $batchSize) === 0) {
            $this->entityManager->flush();
            $this->entityManager->clear(); // Optional: detaches all managed entities to free memory
        }
    }

    // Flush whatever is left over from the last incomplete batch
    $this->entityManager->flush();
    $this->entityManager->clear();
    
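    One caveat with clear(): it detaches every managed entity, including the Identitie objects you are iterating over, so persisting a new IdentiteName that still references a detached Identitie can trigger Doctrine's "new entity found through the relationship" error. A minimal sketch of one way around that, assuming Identitie exposes getId(), is to re-attach the relation through a lightweight reference proxy (this is a hypothetical variant of your helper, only needed if you call clear()):

    private function insertIntoIdentitieName($identitie, $name) {
        // getReference() returns a managed proxy for the given id
        // without loading the full entity from the database
        $identitieRef = $this->entityManager->getReference(
            Identitie::class,
            $identitie->getId()
        );

        $identitieName = new IdentiteName();
        $identitieName->setIdentite($identitieRef);
        $identitieName->setName($name);
        $identitieName->setActive(true);

        $this->entityManager->persist($identitieName);
    }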

    With a dataset this large, I would also avoid loading all 100,000 identities into memory at once and instead stream them from the database, as sketched below.
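
    A minimal sketch of that repository method, assuming Doctrine ORM 2.8+ where Query::toIterable() is available:

    public function getAllIdentities(): iterable
    {
        return $this->getEntityManager()->createQueryBuilder()
            ->select('i')
            ->from('App\Entity\Identitie', 'i')
            ->orderBy('i.id', 'DESC')
            ->getQuery()
            ->toIterable(); // yields entities one by one instead of hydrating the whole result set
    }

    Your foreach loop works unchanged on the returned iterable, and combined with the periodic flush()/clear() above, memory usage stays roughly constant, so you may no longer need to run the command with memory_limit=-1.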