Whenever I deploy my project with appcfg.py, there is a 3-4 second window during which the project returns a 404 Not Found error. I thought this could be avoided by having multiple instances up and running, but the cloud console always shows only one instance running, even though I set min_idle_instances to 3.
How can I avoid 404'ing the server during deployment?
Below is part of the app.yaml file:
instance_class: F4
automatic_scaling:
  min_idle_instances: 3
  max_idle_instances: 6
  min_pending_latency: 30ms  # default value
  max_pending_latency: automatic
  max_concurrent_requests: 40
I see 2 possible explanations:
If you're deploying to the same service/app version as the one already carrying the traffic, you're effectively overwriting the app code, so GAE will stop all instances and start new ones. While this happens the app won't work. There's also a risk of extended downtime, see "Continuous integration/deployment/delivery on Google App Engine, too risky?"
Even if you're deploying a different version, if you immediately switch 100% of the traffic to it and the traffic is high, the autoscaler needs some time to analyze the traffic pattern and spin up enough dynamic instances to handle it. See the details in "Use traffic migration or splitting when switching to a new default version".
I don't think that deploying with appcfg.py or with gcloud app deploy matters in either case.
Always deploying a new version and gradually switching traffic to it once it's confirmed it's running fine should address all these cases.
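As a rough sketch of that workflow with gcloud: deploy the new code under its own version ID without promoting it, check it, then ask App Engine to migrate traffic gradually. The version ID (v2), the service name (default) and the helper names run and deploy_and_migrate are placeholders for illustration, not something from the question. As far as I know, gradual migration in the standard environment also requires warmup requests to be enabled for the app, so treat that as an assumption to verify. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: "v2" and "default" are hypothetical placeholders.

# Print the command in dry-run mode (the default), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Deploy a new version without routing traffic to it, then migrate
# traffic to it gradually instead of switching everything at once.
deploy_and_migrate() {
  new_version="$1"
  service="${2:-default}"

  # --no-promote keeps the currently serving version in place.
  run gcloud app deploy app.yaml --version="$new_version" --no-promote --quiet

  # --migrate asks App Engine to shift traffic gradually.
  run gcloud app services set-traffic "$service" \
    --splits="$new_version=1" --migrate --quiet
}
```

Running deploy_and_migrate v2 prints the two commands for review; DRY_RUN=0 deploy_and_migrate v2 would actually execute them. Between the two steps you would normally check the new version at its version-specific URL before moving any traffic.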
The idle instances can't help here, because they have to be spun up again with the new code. They only help during short traffic peaks anyway, see "What does setting the automatic_scaling max_idle_instances to zero (0) do?"