apibackendapigeethrottling

Apigee SpikeArrest Sync Across MessageProcessors (MPs)


Our organisation is currently migrating to Apigee.

I currently have a problem very similar to this one, but due to the fact that I am a Stack Overflow newcomer and have low reputation I couldn't comment on it: Apigee - SpikeArrest behavior

So, in our organisation we have 6 MessageProcessors (MP) and I assume they are working in a strictly round-robin manner.

Please see this config (It is applied to the TARGET ENDPOINT of the ApiProxy):

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
  <SpikeArrest async="false" continueOnError="false" enabled="true" name="spikearrest-1">
  <DisplayName>SpikeArrest-1</DisplayName>
  <FaultRules/>
  <Properties/>
  <Identifier ref="request.header.some-header-name"/>
  <MessageWeight ref="request.header.weight"/>
  <Rate>3pm</Rate>
</SpikeArrest>

I have a rate of 3pm, which means 1 hit each 20sec, calculated according to ApigeeDoc1.

The problem is that instead of 1 successful hit every 20sec I get 6 successful ones in the range of 20sec and then the SpikeArrest error, meaning it hit once each MP in a round robin manner.

This means I get 6 hit per 20 sec to my api backend instead of the desired 1 hit per 20sec.

Is there any way to sync the spikearrests across the MPs?

ConcurrentRatelimit doesn't seem to help.


Solution

  • SpikeArrest has no ability to be distributed across message processors. It is generally used for stopping large bursts of traffic, not controlling traffic at the levels you are suggesting (3 calls per minute). You generally put it in the Proxy Request Preflow and abort if the traffic is too high.

    The closest you can get to 3 per minute using SpikeArrest with your round robin message processors is 1 per minute, which would result in 6 calls per minute. You can only specify SpikeArrests as "n per second" or "n per minute", which does get converted to "1 per 1/n second" or "1 per 1/n minute" as you mentioned above.

    Do you really only support one call every 20 seconds on your backend? If you are trying to support one call every 20 seconds per user or app, then I suggest you try to accomplish this using the Quota policy. Quotas can share a counter across all message processors. You could also use quotas with all traffic (instead of per user or per app) by specifying a quota identifier that is a constant. You could allow 3 per minute, but they could all come in at the same time during that minute.

    If you are just trying to protect against overtaxing your backend, the ConcurrentRateLimit policy is often used.

    The last solution is to implement some custom code.


    Update to address further questions:

    Restating:

    To get the kind of granularity you are looking for, you'll need to use quotas. Unfortunately you can't set a quota to have a "per second" value on a distributed quota (distributed quota shares the count among message processors rather than having each message processor have its own counter). The best you can do is per minute, which in your case would be 300 calls per minute. Otherwise you can use a non-distributed quota (dividing the quota between the 6 message processors), but the issue you'll have there is that calls that land on some MPs will be rejected while others will be accepted, which can be confusing to your developers.

    For distributed quotas you'd set the 300 calls per minute in an API Product (see the docs), and assign that product to your four apps. Then, in your code, if that product is not assigned for the current API call's app, you'd use a quota that is hardcoded to 10 per second (600 per minute) and use a constant identifier rather than the client_id, so that all other traffic uses that quota.

    Quotas don't keep you from submitting all your requests nearly simultaneously, and I'm assuming your backend can't handle 1200+ requests all at the same time. You'll need to smooth the traffic using a SpikeArrest policy. You'll want to allow the maximum traffic through the SpikeArrest that your backend can handle. This will help protect against traffic spikes, but you'll probably get some traffic rejected that would normally be allowed by the Quota. The SpikeArrest policy should be checked before the Quota, so that rejected traffic is not counted against the app's quota.

    As you can probably see, configuring for situations like yours is more of an art than a science. My suggestion would be to do significant performance/load testing, and tune it until you find the correct values. If you can figure out how to use non-distributed quotas to get acceptable performance and predictability, that will let you work with per second numbers instead of per minute numbers, which will probably make massive spikes less likely.

    Good luck!