Tags: google-cloud-platform, gcloud, autoscaling, google-cloud-billing, rendertron

How to deploy a simple app to GCP with minimal costs (or how to disable autoscaling after deploy)?


In my first attempt at using Cloud to deploy an app...

The problem: unexpected instance-hour usage (Frontend Instance Hours) on GCP (Google Cloud Platform). High traffic was not the cause; for some reason the autoscaling feature created a number of "instances" and "versions".

Solution they suggested: disable autoscaling and stop serving previously deployed versions of your instance. I still need one version/instance running, but in the console I have not yet found where it shows how many versions/instances are running or where to stop them (while verifying that at least one instance keeps working so that my app does not break).

My app is a simple app that was developed by Google developers and is recommended by them for dynamically rendering a JS SPA (it lets search engines and crawlers see fully rendered HTML).
My actual website, together with a Node app that points crawlers to GCP, is hosted elsewhere (on GoDaddy), and the two work together nicely.

The app I deployed to GCP is called Rendertron (https://github.com/GoogleChrome/rendertron)

Google also recommends deploying to GCP (most of the documentation covers that form of deployment). I tried deploying to my GoDaddy shared hosting, but it was not straightforward to get working, so I created a GCP project and deployed there instead. All worked great!

After deploying the app to GCP that has almost no traffic yet, I expected zero costs or at most something under a dollar.

Unfortunately, I received a bill for more than $150 for the month with approx the same projected for the next month.

Without paying an additional $150 for tech support, I was able to reach GCP billing by phone. They are great in that they are willing to reimburse the charges, but only after I resolve the problem myself.

They are generous with throwing a group of document links at you (common causes of unexpected instance hour usage) but can't help further than that.

After many Google searches, reading through documentation, and paying for and watching gcloud tutorials on pluralsight.com, this is what I have understood (or not understood) so far.

I could use some direction on where to focus my investigation:

  1. Do I need to create an instance group (so I can turn off autoscaling from there), and is that where I should focus my attempts?

  2. Or should I keep learning how to update my config in the .yaml file to prevent scaling, for example by setting both min_instances and max_instances to 1, and also learn how to manually stop (directly from the GCP console) any instances/versions beyond the one I need?

  3. A third option?

As a side note, autoscaling with GCP does not seem very intelligent.
Why would my app that has almost no traffic run into an issue that multiple instances were created?

Any insight will be greatly appreciated.


**** Update **** platform info

My app is deployed to Google App Engine (GAE) (deployed code, not a container)

Steps taken for Deploy:

git clone https://github.com/GoogleChrome/rendertron.git
cd rendertron
npm install && npm run build
gcloud app deploy app.yaml --project MY_PROJECT_ID

I simply followed the steps above. My app has been working great, and I have not touched a thing since deployment.

The config (app.yaml) originally deployed was:
(which I made no changes to from the Rendertron repo)

runtime: nodejs12
instance_class: F4_1G
automatic_scaling:
  min_instances: 1
env_variables:
  DISABLE_LEGACY_METADATA_SERVER_ENDPOINTS: "true"

-- Google Cloud Console Info

under App Engine --> Versions
There is 1 item listed with the following values:

Instances: 1
Runtime: nodejs12

Environment: Standard

Size: 392.7 MB
Deployed: Feb 23, 2021
Config:
  runtime: nodejs12
  env: standard
  instance_class: F4_1G
  handlers:
    url: .*
    script: auto
  env_variables:
    DISABLE_LEGACY_METADATA_SERVER_ENDPOINTS: 'true'
  automatic_scaling:
    min_idle_instances: automatic
    max_idle_instances: automatic
    min_pending_latency: automatic
    max_pending_latency: automatic
    min_instances: 1
  network: {}


**** Solution ****
I deployed a new app.yaml in which min_instances: 1 was replaced with max_instances: 1 (I had to redeploy the entire project with the updated app.yaml).

At first I also changed instance_class from F4_1G to F1 to save money, but the app then reported that there was not enough memory and crashed with a 500 server error (the Rendertron app came up but crashed when trying to render something). I changed it back to F4_1G and the app seems to work properly.
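
For reference, the final configuration described above (max_instances: 1, with instance_class back at F4_1G) could be written and redeployed along these lines. This is a sketch reconstructed from the description, not the exact file that was deployed; the helper names are hypothetical and MY_PROJECT_ID is a placeholder.

```shell
# Write the adjusted config. Only the scaling key differs from the original
# app.yaml: min_instances: 1 has been replaced by max_instances: 1.
# usage: write_app_yaml [PATH]
write_app_yaml() {
  cat > "${1:-app.yaml}" <<'EOF'
runtime: nodejs12
instance_class: F4_1G
automatic_scaling:
  max_instances: 1
env_variables:
  DISABLE_LEGACY_METADATA_SERVER_ENDPOINTS: "true"
EOF
}

# Redeploy with the same command used for the first deployment.
# usage: redeploy PROJECT_ID [PATH]
redeploy() {
  gcloud app deploy "${2:-app.yaml}" --project "$1"
}

# Example (run from the rendertron checkout after npm install && npm run build):
#   write_app_yaml app.yaml
#   redeploy MY_PROJECT_ID app.yaml
```

Without min_instances, nothing forces an instance to stay up while the app is idle, which would be consistent with the drop in cost reported below.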

If I see charges again when my traffic goes up, I will check whether an instance class between F1 and F4_1G has enough memory for my app while keeping charges to a minimum.

Below you can see that from Friday, when I made the change, through the following Sunday the costs dropped to 0 while the app kept running properly:
[Screenshot: GCP billing report showing costs dropped after the change]


Solution

  • The rendertron repo suggests using App Engine standard (app.yaml) and so I assume that's what you're using.

    If you are using App Engine standard then:

    There are at least 2 critical variables with App Engine standard: the size of the App Engine instances you're using and the number of them:

    1. You may wish to use a (cheaper) instance class (link).
    2. You can set max_instances: 1 to limit the number of instances (link).
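
To see how those two variables multiply into a bill, here is a rough sketch of the arithmetic. Every number in it is an assumption rather than a figure from the question: an F1 rate of 5 cents per instance-hour, F4_1G billed as 6 F1 instance-hours per clock hour, 28 free instance-hours per day, and a 30-day month. Rates differ by region and change over time, so check the current App Engine pricing page before relying on them. The function name is hypothetical.

```shell
# Rough monthly cost, in cents, of keeping App Engine standard instances running.
# ASSUMED figures (not from the question; verify against current pricing):
F1_RATE_CENTS=5      # cost of one F1 instance-hour, in cents
FREE_HOURS_DAY=28    # free instance-hours per day
DAYS_PER_MONTH=30

# usage: estimate_monthly_cents MULTIPLIER HOURS_PER_DAY [INSTANCES]
#   MULTIPLIER is the instance class in F1 units (assumed: F1=1, F4_1G=6).
estimate_monthly_cents() {
  multiplier="$1"
  hours_per_day="$2"
  instances="${3:-1}"

  billable=$(( multiplier * hours_per_day * instances - FREE_HOURS_DAY ))
  # Usage below the free allowance costs nothing.
  if [ "$billable" -lt 0 ]; then
    billable=0
  fi
  echo $(( billable * F1_RATE_CENTS * DAYS_PER_MONTH ))
}

# One F4_1G instance kept up all day (the effect of min_instances: 1):
estimate_monthly_cents 6 24   # prints 17400, i.e. about $174
# One F1 instance kept up all day stays inside the assumed free allowance:
estimate_monthly_cents 1 24   # prints 0
```

Under these assumptions, a single always-on F4_1G instance lands in the same range as the bill described in the question, even with almost no traffic.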

    It appears your bandwidth use is low (and will be constrained by the above to a large extent) but bear this in mind too, as well as the fact that...

    Your app is likely exposed on the public Internet and so could quite easily be consuming traffic from scrapers and other "actors" who stumble upon your endpoint and GET it.

    As you've seen, it's quite easy to over-consume (cloud-based) resources and face larger-than-anticipated bills. There are some controls in GCP that permit you to monitor (not necessarily quench) big bills (link).
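
Assuming the control referred to above is a Cloud Billing budget, a minimal sketch of creating one from the CLI might look like this. The billing account ID, the display name, the amount and the helper name are all placeholders chosen for illustration. A budget only sends alerts when the thresholds are crossed; it does not stop spending.

```shell
# Create a budget that alerts at 50% and 90% of the chosen amount.
# usage: create_budget_alert BILLING_ACCOUNT_ID [AMOUNT]
#   BILLING_ACCOUNT_ID is a placeholder for your own billing account.
create_budget_alert() {
  billing_account="$1"
  amount="${2:-10USD}"   # assumed amount, pick your own

  gcloud billing budgets create \
    --billing-account="$billing_account" \
    --display-name="rendertron-budget" \
    --budget-amount="$amount" \
    --threshold-rule=percent=0.5 \
    --threshold-rule=percent=0.9
}

# Example:
#   create_budget_alert BILLING_ACCOUNT_ID 10USD
```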

    The only real solution is to become as familiar as you can with the platform and how its resources are priced.

    Update #1

    My preference is to use gcloud (CLI) for managing services but I think your preference is the Console.

    When you deploy an "app" to App Engine, it comprises one or more services (at minimum, default). I've deployed the simplest "Hello World!" app, comprising a single default service (Node.js):

    Services

    https://console.cloud.google.com/appengine/services?serviceId=default&project=[[YOUR-PROJECT-ID]]

    I deployed it multiple (3) times as if I were evolving the app. On the "Versions" page, 3 versions are listed:

    Versions

    https://console.cloud.google.com/appengine/versions?serviceId=default&project=[[YOUR-PROJECT-ID]]

    NOTE There are multiple versions stored on the platform but only the latest is serving traffic (100% of it). IIRC App Engine standard does not charge to store multiple versions.
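
The same information is available from gcloud. The sketch below wraps the relevant commands in two hypothetical helper functions; the project ID and version ID are placeholders, and the service is assumed to be default as in the example above.

```shell
# Show which versions receive traffic and which instances are running now.
# usage: show_app_engine_state PROJECT_ID [SERVICE]
show_app_engine_state() {
  project_id="$1"
  service="${2:-default}"

  # Only versions that currently receive traffic.
  gcloud app versions list --project "$project_id" --service "$service" --hide-no-traffic

  # Instances running right now; each row names the version it belongs to.
  gcloud app instances list --project "$project_id" --service "$service"
}

# Delete an old version that no longer receives traffic (this cannot be undone).
# usage: delete_old_version PROJECT_ID VERSION_ID [SERVICE]
delete_old_version() {
  gcloud app versions delete "$2" --project "$1" --service "${3:-default}"
}

# Example:
#   show_app_engine_state MY_PROJECT_ID
```

Listing first and deleting second keeps the one serving version untouched, so the app stays up while older versions are cleaned out.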

    I tweaked the configuration (app.yaml) to specify instance_class (F1) and to limit max_instances: 1:

    app.yaml:

    runtime: nodejs14
    instance_class: F1
    automatic_scaling:
      max_instances: 1
    

    And, this is reflected in the deployed app's config:

    Config

    Update #2

    If you can encourage someone to write a Dockerfile and contribute it to the rendertron repo, you could then deploy the container to various alternative services (both Google and non-Google).

    A curious fact with App Engine standard is that, while you deploy 'code' to the platform, it creates a container image from your artifacts and this is what gets deployed to App Engine. You can prove this to yourself by viewing the Container Registry (service) in your project:

    Container Registry

    https://console.cloud.google.com/gcr/images/dazwilkin-210503-67357098?project=[[YOUR-PROJECT-ID]]

    And, if you wish, you could reuse that image elsewhere.

    Google Cloud Run is probably your best option on Google. Cloud Run lets you restrict the number of instances you run, and it makes it easier to limit access to the deployed app to authenticated users.
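
As a sketch of what that could look like once a rendertron container image exists, the helper below deploys it to Cloud Run with at most one instance and no unauthenticated access. The image reference, the service name rendertron, the region default and the 1Gi memory setting are assumptions for illustration, not values from the rendertron project.

```shell
# Deploy a container image to Cloud Run, capped at one instance and private.
# usage: deploy_to_cloud_run PROJECT_ID IMAGE [REGION]
#   IMAGE is a placeholder for wherever the rendertron image has been pushed.
deploy_to_cloud_run() {
  project_id="$1"
  image="$2"
  region="${3:-us-central1}"   # assumed region

  gcloud run deploy rendertron \
    --project "$project_id" \
    --image "$image" \
    --region "$region" \
    --max-instances 1 \
    --memory 1Gi \
    --no-allow-unauthenticated
}

# Example:
#   deploy_to_cloud_run MY_PROJECT_ID IMAGE_URL
```

With --no-allow-unauthenticated, whatever calls the service (for example the Node app that forwards crawlers) has to send credentials with each request, so public scrapers cannot trigger renders.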

    With a container, you can deploy rendertron anywhere that runs containers as a service.