I can't find any documentation on these two terms. I pored over the AWS docs and Google results.
What is the difference between burst limit and rate limit? When I go to change the settings for default route throttling on my API, there are just two number inputs. It doesn't say what unit or time frame these numbers represent. Is it API calls per second? per minute?
The burst limit defines the number of requests your API can handle concurrently. The rate limit defines the number of allowed requests per second. Together they implement the token bucket algorithm: the rate is how fast tokens are refilled, and the burst is the capacity of the bucket.
Concurrently means that requests run in parallel. Assuming that one request takes 10 ms, you could have 100 requests per second with a concurrency of 1 if they were all executed in series. But if they were all executed at the same moment, the concurrency would be 100. In both cases a rate limit of 100 would suffice. In the first case, a burst limit of 1 would allow all requests to succeed. In the second case, the same burst limit of 1 would deny 99 of the 100 requests.
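To make the two cases concrete, here is a minimal sketch of a generic token bucket in Python. It is not AWS's actual implementation; the `TokenBucket` class, the `simulate` helper, and the use of integer millisecond timestamps are all assumptions made for illustration.

```python
class TokenBucket:
    """Generic token bucket (illustrative only, not API Gateway's code)."""

    def __init__(self, rate, burst):
        self.rate = rate            # tokens refilled per second (rate limit)
        self.burst = burst          # bucket capacity (burst limit)
        self.tokens = float(burst)  # the bucket starts full
        self.last_ms = 0            # time of the previous request, in ms

    def allow(self, now_ms):
        # Refill for the time elapsed, never exceeding the capacity.
        elapsed_ms = now_ms - self.last_ms
        self.tokens = min(self.burst, self.tokens + elapsed_ms * self.rate / 1000)
        self.last_ms = now_ms
        # Each request costs one token; without a token it is throttled.
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def simulate(rate, burst, arrivals_ms):
    """Return how many of the requests arriving at arrivals_ms are allowed."""
    bucket = TokenBucket(rate, burst)
    return sum(bucket.allow(t) for t in arrivals_ms)


serial = list(range(0, 1000, 10))  # 100 requests, one every 10 ms
simultaneous = [0] * 100           # 100 requests at the same instant

print(simulate(100, 1, serial))          # 100: burst of 1 is enough in series
print(simulate(100, 1, simultaneous))    # 1: the other 99 are denied
print(simulate(100, 100, simultaneous))  # 100: burst of 100 absorbs the spike
```

With the same rate limit of 100, only the burst limit decides whether the simultaneous requests get through.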
The official documentation only mentions the Token bucket algorithm briefly.