Tags: memory-leaks, laravel-8

Laravel loop memory leak


I have a memory leak problem (I think that's what it's called) that I haven't been able to solve despite numerous attempts over the past 24 hours.

Background: I have an articles table (~37M rows) with article_id and journal_id columns, and a journal_specialty table (~29K rows) with journal_id and specialty_id columns. I want to populate an article_specialty table with article_id and specialty_id pairs using the data from the other two tables, i.e. record which specialty(ies) each article belongs to, based on the specialties of the journal it was published in.
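For reference, here is a minimal migration sketch of the pivot table being populated. The column names follow the description above; the integer types, the composite primary key (which is what lets insertOrIgnore() skip duplicates), and the class name are my assumptions:

use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

class CreateArticleSpecialtyTable extends Migration
{
  public function up()
  {
    Schema::create('article_specialty', function (Blueprint $table) {
      // One row per (article, specialty) pair derived from the journal's specialties
      $table->unsignedBigInteger('article_id');
      $table->unsignedBigInteger('specialty_id');

      // Assumed composite primary key; insertOrIgnore() needs a unique constraint to skip duplicates
      $table->primary(['article_id', 'specialty_id']);
    });
  }

  public function down()
  {
    Schema::dropIfExists('article_specialty');
  }
}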

My code:

namespace App\Console\Commands\Articles;

use Illuminate\Console\Command;
use Illuminate\Support\Facades\DB;

class xa02_ArticleSpecialty extends Command
{
  protected $signature = 'articles:xa02-article-specialty';
  protected $description = 'Populate article_specialty table using journal_specialty relationships';

  private const BATCH_SIZE = 50000;
  private const INSERT_CHUNK_SIZE = 25000;

  public function handle()
  {
    $this->info('Starting article_specialty population...');
    $startTime = microtime(true);

    // Build an in-memory map of journal_id => [specialty_id, ...] once, up front
    $journalSpecialties = DB::table('journal_specialty')
      ->select('journal_id', 'specialty_id')
      ->get()
      ->groupBy('journal_id')
      ->map(fn($group) => $group->pluck('specialty_id')->toArray())
      ->toArray();

    // SHOW TABLE STATUS returns an approximate row count for InnoDB, which is enough for the progress bar
    $totalRecords = DB::select("SHOW TABLE STATUS LIKE 'articles'")[0]->Rows;

    $this->info("Total articles to process: $totalRecords");

    $bar = $this->output->createProgressBar($totalRecords);
    $bar->start();

    $offset = 0;

    // Debugging: Show initial memory usage
    $this->info('Initial memory usage: ' . memory_get_usage() . ' bytes');

    while ($offset < $totalRecords) {
      $articles = DB::table('articles')
        ->orderBy('date_e')
        ->limit(self::BATCH_SIZE)
        ->offset($offset)
        ->get()
        ->toArray();

      // Debugging: Show memory usage after fetching articles
      $this->info('Memory usage after fetching articles: ' . memory_get_usage() . ' bytes');

      if (empty($articles)) {
        break; // ✅ No more data to process
      }

      $insertData = [];

      foreach ($articles as $article) {
        if (isset($journalSpecialties[$article->journal_id])) {
          foreach ($journalSpecialties[$article->journal_id] as $specialty_id) {
            $insertData[] = [
              'article_id' => $article->article_id,
              'specialty_id' => $specialty_id
            ];
          }
        }

        if (count($insertData) >= self::INSERT_CHUNK_SIZE) {
          DB::table('article_specialty')->insertOrIgnore($insertData);
          $insertData = []; // ✅ Free memory immediately
        }
      }

      // Debugging: Show memory usage after insert
      $this->info('Memory usage after inserting: ' . memory_get_usage() . ' bytes');

      if (!empty($insertData)) {
        DB::table('article_specialty')->insertOrIgnore($insertData);
      }

      $bar->advance(count($articles));

      // Trying to free memory after processing
      $articles = null;
      $insertData = null;
      gc_collect_cycles();
      clearstatcache();
      DB::table('articles')->newQuery();
      DB::flushQueryLog();

      $offset += self::BATCH_SIZE;

      // Debugging: Show memory usage after processing batch
      $this->newLine();
      $this->info('Memory usage after processing batch: ' . memory_get_usage() . ' bytes');
    }

    $bar->finish();
    $this->newLine();

    $totalTime = microtime(true) - $startTime;
    $this->info('✅ Completed! Processed articles in ' . gmdate("H:i:s", (int) $totalTime) . '.');

    // Debugging: Show final memory usage
    $this->info('Final memory usage: ' . memory_get_usage() . ' bytes');
  }
}

Outcome:

Starting article_specialty population...
Total articles to process: 37765760
        0/37765760 [>---------------------------]   0%
Initial memory usage: 35593704 bytes
Memory usage after fetching articles: 141872128 bytes
Memory usage after inserting: 147389216 bytes
   100000/37765760 [>---------------------------]   0%
Memory usage after processing batch: 41217656 bytes
Memory usage after fetching articles: 145440808 bytes
Memory usage after inserting: 155017720 bytes
   200000/37765760 [>---------------------------]   0%
Memory usage after processing batch: 46857472 bytes
Memory usage after fetching articles: 151319400 bytes
Memory usage after inserting: 161351936 bytes
   300000/37765760 [>---------------------------]   0%
Memory usage after processing batch: 52506008 bytes
Memory usage after fetching articles: 156635200 bytes
Memory usage after inserting: 166937624 bytes
...

Memory usage keeps climbing until the command aborts by exhausting the allowed memory:

PHP Fatal error:  Allowed memory size of 536870912 bytes exhausted (tried to allocate 1052672 bytes)

What am I missing???


Solution

  • Finally! I got it! The issue was caused by Laravel's database event dispatcher, which kept accumulating query-event data during the batch inserts and eventually exhausted memory. The fix was simply to detach it from the connection:

    DB::connection()->unsetEventDispatcher();
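
    For context, here is a minimal sketch of where that call could go in the command above; placing it at the top of handle(), before the batching loop, is my assumption:

    public function handle()
    {
      // Detach the event dispatcher from the DB connection so that the
      // QueryExecuted events fired for every statement below are no longer
      // dispatched (and nothing listening to them can pile up in memory).
      DB::connection()->unsetEventDispatcher();

      $this->info('Starting article_specialty population...');
      // ... the rest of the batching loop from the question, unchanged
    }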