
Defining BackoffStrategy for SQS in AWS


I want to set up a backoff strategy for SQS in a Spring application. This is what I did:

    @Bean
    public ConnectionFactory sqsConnectionFactory() {

        PredefinedBackoffStrategies.ExponentialBackoffStrategy backoffStrategy = new PredefinedBackoffStrategies.ExponentialBackoffStrategy(3, 27);
        RetryPolicy retryPolicy = new RetryPolicy(PredefinedRetryPolicies.DEFAULT_RETRY_CONDITION, backoffStrategy, PredefinedRetryPolicies.DEFAULT_MAX_ERROR_RETRY, false);
        return SQSConnectionFactory.builder()
                .withRegion(Region.getRegion(Regions.fromName(region)))
                .withAWSCredentialsProvider(new DefaultAWSCredentialsProviderChain())
                .withClientConfiguration(new ClientConfiguration().withRetryPolicy(retryPolicy))
                .build();
    }

However, it has no effect. I read from the SQS queue in a simple @JmsListener method, which calls another API. When that API returns a 404 error, the message is retried, but the retry happens instantly rather than after an exponentially increasing delay. Why is that, and how do I properly configure an exponential backoff strategy?


Solution

  • The backoff strategy set in the ClientConfiguration in your code controls the delays between the AWS client's retries of its own requests to AWS services. In other words, your strategy applies only if the SQS client fails to reach the SQS service, for example when fetching a message or polling for new messages. If such a failure occurs, the next attempt is made after the delay computed by the configured ExponentialBackoffStrategy. For more details, refer to the official documentation.
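    To make the client-side behaviour concrete, here is a minimal sketch of the usual capped exponential formula, min(base * 2^n, max). It is an illustration, not the SDK's source; the class and method names are hypothetical, and it assumes the two constructor arguments in the question are a base delay and a cap in milliseconds.

```java
// Illustrative sketch only; BackoffSketch and delayMillis are hypothetical names,
// not part of the AWS SDK.
final class BackoffSketch {

    /** Delay before retry number retriesAttempted (0-based): min(base * 2^n, max). */
    static long delayMillis(int retriesAttempted, long baseDelayMillis, long maxBackoffMillis) {
        // Limit the exponent so the shift stays small for typical base delays.
        int exponent = Math.min(retriesAttempted, 30);
        return Math.min(baseDelayMillis << exponent, maxBackoffMillis);
    }

    public static void main(String[] args) {
        // Same numbers as in the question: base 3, cap 27 (assumed to be milliseconds).
        for (int attempt = 0; attempt < 5; attempt++) {
            System.out.println("retry " + attempt + " -> " + delayMillis(attempt, 3, 27) + " ms");
        }
    }
}
```

    Under that assumption, the values 3 and 27 would give delays of only a few milliseconds, and only for failed calls from the client to SQS, never for exceptions thrown inside the listener.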

    Reason for immediate retry

    In your case, the message has already been fetched from the SQS service by the underlying client (the one used by Spring's @JmsListener). Only a failure during that fetch would have used the ExponentialBackoffStrategy. A failure after it, such as the exception thrown after the 404, triggers a failure acknowledgement to the SQS service, and the service makes the message visible for consumption again immediately.

    How to associate the backoff strategy with redelivery

    Sadly, that strategy cannot be associated with message consumption failures. The delay you want is actually the JMS 2.0 specification's redelivery-delay. However, the SQS JMS provider you seem to be using is https://github.com/awslabs/amazon-sqs-java-messaging-lib, which is a JMS 1.1 implementation. Its documentation says:

    This project builds on top of the AWS SDK for Java to use Amazon SQS as the JMS (as defined in 1.1 specification) provider

    Also, the SQS redrive-policy has nothing like redelivery-delay (only the Maximum Receives and the Dead Letter Queue association). A possible workaround is to handle failures yourself and set a message-specific delay that grows on every re-queue. This would probably mean tracking the retry count in message headers and not using JMS. Note that this may incur additional charges.
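    The delay calculation for that workaround could look roughly like the sketch below. The class and method names are hypothetical, as is the choice of a doubling schedule; the only SQS-specific fact used is that a per-message delay cannot exceed 900 seconds (15 minutes).

```java
// Hypothetical helper for the re-queue workaround; not part of any AWS library.
final class RequeueDelay {

    // SQS caps the per-message delay at 15 minutes.
    static final int SQS_MAX_DELAY_SECONDS = 900;

    /** Delay for the next re-queue: baseSeconds * 2^retryCount, capped at the SQS limit. */
    static int nextDelaySeconds(int retryCount, int baseSeconds) {
        if (retryCount < 0 || baseSeconds <= 0) {
            throw new IllegalArgumentException("retryCount must be >= 0 and baseSeconds > 0");
        }
        // Bound the exponent and widen to long so the shift cannot overflow.
        int exponent = Math.min(retryCount, 20);
        long delay = ((long) baseSeconds) << exponent;
        return (int) Math.min(delay, SQS_MAX_DELAY_SECONDS);
    }

    /** True once the message has been re-queued maxRetries times and should be dropped or dead-lettered. */
    static boolean shouldGiveUp(int retryCount, int maxRetries) {
        return retryCount >= maxRetries;
    }
}
```

    On failure, the handler would read the retry count from a message attribute of your choosing, send a copy of the message with the count incremented and the computed delay (with the AWS SDK for Java 1.x, presumably via SendMessageRequest.withDelaySeconds), and then acknowledge the original so it is not redelivered immediately.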

    On a side note: adding a delay to the queue or changing the visibility timeout would not help to add delays between retries after failures while reading messages.