I recently migrated my App from version 1 of Cloud Endpoints Frameworks to version 2 (App Engine python standard). Supposedly one of the benefits is reduced request latency. Ignoring warm up and/or starting the backend instance, it seems I was getting unexplained latency outside observable logs/app stats, of anywhere from 300ms to 2 seconds. After observing this for about a week I finally just went back to the echo example and deployed to a test app and noticed the exact same behavior.
Echo example: https://cloud.google.com/endpoints/docs/frameworks/python/get-started-frameworks-python
Here is the relevant instance settings in my app.yaml:
runtime: python27
env: standard
threadsafe: true
instance_class: B2
basic_scaling:
idle_timeout: 900s
max_instances: 2
Here are 3 requests to the api via curl spaced multiple seconds apart. The test app should not be doing anything else:
MacBook-Pro$ curl -H "Content-Type: application/json" -X POST -d '{"content":"hello world"}' https://testtictactoe-164905.appspot.com/_ah/api/echo/v1/echo -w "@curl-format.txt"
{
"content": "hello world"
}
time_namelookup: 0.005
time_connect: 0.019
time_appconnect: 0.152
time_pretransfer: 0.152
time_redirect: 0.000
time_starttransfer: 0.263
--------
time_total: 0.263
MacBook-Pro$ curl -H "Content-Type: application/json" -X POST -d '{"content":"hello world"}' https://testtictactoe-164905.appspot.com/_ah/api/echo/v1/echo -w "@curl-format.txt"
{
"content": "hello world"
}
time_namelookup: 0.005
time_connect: 0.020
time_appconnect: 0.144
time_pretransfer: 0.144
time_redirect: 0.000
time_starttransfer: 0.613
--------
time_total: 0.613
MacBook-Pro$ curl -H "Content-Type: application/json" -X POST -d '{"content":"hello world"}' https://testtictactoe-164905.appspot.com/_ah/api/echo/v1/echo -w "@curl-format.txt"
{
"content": "hello world"
}
time_namelookup: 0.005
time_connect: 0.021
time_appconnect: 0.145
time_pretransfer: 0.145
time_redirect: 0.000
time_starttransfer: 1.028
--------
time_total: 1.028
Here is the app stats for the first 0.263 second request (Grand Total 48ms): Here is part of the log entry for 0.263 second request:
Here is the app stats for the 1.028 second request (Grand Total 504ms):
Here is part of the log entry for the 1.028 second request
Here is the instance latency for the last hour:
Getting rid of API management did the trick. I tried this based on the answer from @Codiak to this question: stackoverflow.com/questions/45585413/
Once I made this change latency was much more consistent and dramatically improved.
To accomplish this I basically removed these two lines from app.yaml:
ENDPOINTS_SERVICE_NAME:
ENDPOINTS_SERVICE_VERSION: 2018-05-05r0
I also removed [apiname]openapi.json from my project. This also means I no longer have to go through many of the steps described here: