Normally we get around 2 requests per second. However, after we pushed a notification to 3,000 users, traffic suddenly jumped to 120 requests per second. Unfortunately, around half of those users got 5XX server errors, meaning half of the users who came in saw blank pages. Once the spike passed, the server errors never happened again.
I did some research, and it seems to be caused by startup time: instance startup was taking too long and was therefore aborted. I checked my instance count: as many as 90 instances were created, but active instances dropped from 40 to 0 within a second. The problem only occurred when there was a sudden increase in requests, but I thought App Engine was supposed to be able to handle this kind of increase.
My question is: how can I fix this problem, or where should I keep digging to find its root cause? Thanks in advance!
Thank you all for the help. I've figured out the problem.
Credit goes to Dan Cornilescu: his comments gave me the leads to find the root cause, which was that I did not have enough min_idle_instances. Once I set min_idle_instances high enough in the automatic scaling section of my app.yaml, I stopped receiving 5XX server errors.
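
For anyone hitting the same issue, the change looks roughly like the fragment below. This is only a sketch: the value 5 is an illustrative assumption, not the number I actually used, and the right value depends on how large your traffic spikes are and how long your instances take to start. The rest of app.yaml (runtime, handlers, and so on) stays as it was.

```yaml
# app.yaml (fragment): only the scaling section is shown.
automatic_scaling:
  # Keep this many instances warm and idle so a sudden spike is served
  # by already-started instances instead of waiting on cold starts.
  # The value 5 is a placeholder assumption; tune it for your own traffic.
  min_idle_instances: 5
```

Keep in mind that idle instances are kept running even when there is no traffic, so a higher value trades cost for headroom during spikes.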