laravel swiftmailer amazon-ses laravel-queue

Swift_TransportException: Expected response code 250 but got an empty response intermittently in queue worker using Amazon SES


We're seeing frequent Swift_TransportException: Expected response code 250 but got an empty response errors from queued jobs when sending mail through Amazon SES with the plain SMTP driver. The error is intermittent, but once it first occurs all of our mail stops sending. Restarting the queue worker fixes it, but the problem inevitably comes back.

Some things we've considered:

Note: I searched other questions here on Stack Overflow, but they mostly concerned configuration errors. This question isn't about mail not working at all; it's about intermittent failure.


Solution

  • It seems we may have been sending mail to Amazon SES too fast, which caused some kind of intermittent connection problem. We fixed this by adding a retry delay to our queue worker:

    php artisan queue:work --sleep=3 --tries=3 --delay=30
    

    The delay applies whenever a job fails: the worker waits that long before retrying it, so with the command above a failed job is retried after 30 seconds. In the meantime, other jobs are still processed from the queue; once the 30 seconds are up, the worker pulls the failed job again and retries it.

    By default Laravel uses a delay of 0, which means a failed job is retried immediately. That was likely what caused our problems; in most situations you want a bit of a grace period before retrying failed jobs.

    In addition, we've set up a small listener that fires whenever a job fails with this SwiftMailer exception. It reports the exception, waits a few seconds, and then gracefully restarts the queue worker, so the next job picked off the queue runs in a fresh process. It hasn't fired for us yet, but it may prove useful if you are experiencing intermittent connection problems with long-running processes.

    // Added in `AppServiceProvider` under `boot` function
    if ($this->app->runningInConsole()) {
        $this->app['queue']->failing(function (\Illuminate\Queue\Events\JobFailed $event) {
            if (strpos($event->exception->getMessage(), 'response code 250') !== false) {
                report(new \Exception('Got swift 250 error, restarting queue.'));
    
                sleep(5);
                \Artisan::call('queue:restart');
            }
        });
    }
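
    The retry settings from the worker command can also be declared on an individual job, so they apply only to mail jobs rather than to everything the worker processes. The following is a minimal sketch, not our actual code: the class name SendWelcomeMail and its $address property are hypothetical, and it assumes Laravel 8 or later, where the property is named $backoff (as far as I know, earlier releases call it $retryAfter, matching the older --delay flag).

    ```php
    <?php

    namespace App\Jobs;

    use Illuminate\Bus\Queueable;
    use Illuminate\Contracts\Queue\ShouldQueue;
    use Illuminate\Foundation\Bus\Dispatchable;
    use Illuminate\Queue\InteractsWithQueue;
    use Illuminate\Queue\SerializesModels;
    use Illuminate\Support\Facades\Mail;

    // Hypothetical job used only for illustration.
    class SendWelcomeMail implements ShouldQueue
    {
        use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

        // Give up after three attempts, like --tries=3 on the worker.
        public $tries = 3;

        // Seconds to wait before a failed attempt is retried, like --delay=30.
        // Assumption: Laravel 8+; rename to $retryAfter on older releases.
        public $backoff = 30;

        // Recipient address (hypothetical field).
        private $address;

        public function __construct(string $address)
        {
            $this->address = $address;
        }

        public function handle()
        {
            // Any exception thrown here (including Swift_TransportException)
            // marks the attempt as failed and triggers the backoff above.
            Mail::raw('Welcome!', function ($message) {
                $message->to($this->address)->subject('Welcome');
            });
        }
    }
    ```

    Dispatching it with SendWelcomeMail::dispatch('user@example.com') should then behave like the worker-level flags for this job only, while other jobs keep the worker's defaults.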