Im looking into to use a Load Balancer in front of our API management, for example if a 1000 requests coming in in 5 second i want the 1001th request to be denied.
Which solution would work in this case?
API Management you can use Rate Limit policies:
The rate-limit-by-key policy prevents API usage spikes on a per key basis by limiting the call rate to a specified number per a specified time period.
The key can have an arbitrary string value and is typically provided using a policy expression. Optional increment condition can be added to specify which requests should be counted towards the limit. When this call rate is exceeded, the caller receives a 429 Too Many Requests response status code.
Example:
<policies>
<inbound>
<base />
<rate-limit-by-key calls="10"
renewal-period="60"
increment-condition="@(context.Response.StatusCode == 200)"
counter-key="@(context.Request.IpAddress)"
remaining-calls-variable-name="remainingCallsPerIP"/>
</inbound>
<outbound>
<base />
</outbound>
</policies>