google-app-engine-pythongoogle-cloud-endpoints-v2

Cloud Endpoints Framework Version 2 inconsistent and high unexplainable latency


I recently migrated my App from version 1 of Cloud Endpoints Frameworks to version 2 (App Engine python standard). Supposedly one of the benefits is reduced request latency. Ignoring warm up and/or starting the backend instance, it seems I was getting unexplained latency outside observable logs/app stats, of anywhere from 300ms to 2 seconds. After observing this for about a week I finally just went back to the echo example and deployed to a test app and noticed the exact same behavior.

Echo example: https://cloud.google.com/endpoints/docs/frameworks/python/get-started-frameworks-python

Here is the relevant instance settings in my app.yaml:

runtime: python27
env: standard
threadsafe: true
instance_class: B2
basic_scaling:
    idle_timeout: 900s
    max_instances: 2

Here are 3 requests to the api via curl spaced multiple seconds apart. The test app should not be doing anything else:

MacBook-Pro$ curl -H "Content-Type: application/json" -X POST -d '{"content":"hello world"}' https://testtictactoe-164905.appspot.com/_ah/api/echo/v1/echo -w "@curl-format.txt"
{
 "content": "hello world"
}
time_namelookup: 0.005
time_connect: 0.019
time_appconnect: 0.152
time_pretransfer: 0.152
time_redirect: 0.000
time_starttransfer: 0.263
--------
time_total: 0.263

MacBook-Pro$ curl -H "Content-Type: application/json" -X POST -d '{"content":"hello world"}' https://testtictactoe-164905.appspot.com/_ah/api/echo/v1/echo -w "@curl-format.txt"
{
 "content": "hello world"
}
time_namelookup: 0.005
time_connect: 0.020
time_appconnect: 0.144
time_pretransfer: 0.144
time_redirect: 0.000
time_starttransfer: 0.613
--------
time_total: 0.613

MacBook-Pro$ curl -H "Content-Type: application/json" -X POST -d '{"content":"hello world"}' https://testtictactoe-164905.appspot.com/_ah/api/echo/v1/echo -w "@curl-format.txt"
{
 "content": "hello world"
}
time_namelookup: 0.005
time_connect: 0.021
time_appconnect: 0.145
time_pretransfer: 0.145
time_redirect: 0.000
time_starttransfer: 1.028
--------
time_total: 1.028

Here is the app stats for the first 0.263 second request (Grand Total 48ms): enter image description here Here is part of the log entry for 0.263 second request: enter image description here

Here is the app stats for the 1.028 second request (Grand Total 504ms):

App stats for 1.028 second request Here is part of the log entry for the 1.028 second request enter image description here

Here is the instance latency for the last hour:

enter image description here

  1. Why is the actual latency of my request so much higher than what is shown by the logs and app stats?
  2. Is this extra overhead latency of up to 600ms, which seems completely out of my control, just an expected part of making google cloud endpoints requests?
  3. Why does this latency fluctuate so much when dealing with just one instance and very few incoming requests?

Solution

  • Getting rid of API management did the trick. I tried this based on the answer from @Codiak to this question: stackoverflow.com/questions/45585413/

    Once I made this change latency was much more consistent and dramatically improved.

    To accomplish this I basically removed these two lines from app.yaml:

    ENDPOINTS_SERVICE_NAME:

    ENDPOINTS_SERVICE_VERSION: 2018-05-05r0

    I also removed [apiname]openapi.json from my project. This also means I no longer have to go through many of the steps described here:

    1. https://cloud.google.com/endpoints/docs/frameworks/python/get-started-frameworks-python#deploy_configuration
    2. https://cloud.google.com/endpoints/docs/frameworks/python/get-started-frameworks-python#deploy_backend