Tags: flask, google-app-engine, single-sign-on, saml, ddos

Can I prevent Google App Engine instances from being created unnecessarily?


I am currently running a Flask web application (Dash, to be more precise) on Google App Engine (Standard Environment) as a service. For the moment I have a custom login/logout page and handle user sessions with Flask-Login, which I use both to serve content based on the currently authenticated user and to keep that user's session active.

My App Engine service is configured to scale up and down automatically based on traffic, and I would like to prevent instances from being created unnecessarily (for instance, by DDoS attacks). I already know that:

App Engine sits behind the Google Front End which mitigates and absorbs many Layer 4 and below attacks, [...]

according to Google's official documentation: link. Moreover, I am aware, again citing the same documentation, that:

Currently, [Google Compute Engine API] projects are limited to an API rate limit of 20 requests/second.

which could mitigate a DDoS attack to some extent (App Engine actually runs on Google Compute Engine, unless I am mistaken).

I am looking for a solution involving a third-party (or Google) application that would act as a middleman between the user and my application. It would handle the sign-in part and redirect the user to my web application after a successful login, while protecting my website from being accessed unintentionally (e.g. by crawlers) and thus preventing instances from being created.

Does such an application exist? I am looking into SSO providers that support a login/logout protocol such as SAML or OpenID Connect (Firebase is a good candidate, for instance), but I am unsure whether this would prevent my instances from being created unnecessarily. Finally, I do not want to have to whitelist users based on their IP address.


Solution

    1. You're basically looking for a solution that blocks 'certain traffic' (bots, crawlers, bad actors) from reaching your website.

    2. See if this SO response helps. Note that both Cloud Load Balancing and Google Cloud Armor mentioned in that solution are paid services.

    3. The Google App Engine solution for this is the Firewall (it's free). You specify an IP or a range of IPs and an action (ALLOW or DENY). Google Cloud matches all incoming traffic against the rules. If it finds a match and the action is DENY, the traffic is dropped and never reaches your instance (so no new instances are created to serve it). This is essentially a blacklist rather than the whitelist you mentioned.

      Note: Every new Google App Engine app has a default firewall rule that allows all traffic.

    4. I assume Google already blocks a known list of bad actors (this is just an assumption on my part) and leaves you to handle the rest via Firewall rules. The current design of the firewall requires you to go through your logs manually, identify traffic that comes from bots/spam/crawlers, and then manually create a firewall rule against those IPs. The other challenge is that these bots/spam/crawlers frequently change their IPs. But you might be lucky and have only a small set of bad actors visiting your site, in which case you can quickly create firewall rules blocking their IPs.

    5. You should search for solutions that let you automate the previous step. We face the same challenge for our site and built a solution that lets us semi-automate it. We're currently working on a desktop app that will allow full automation (you set a schedule; it parses your logs, identifies spam/bots/etc., and creates the firewall rules). If you'd like to be notified when this is ready, you can sign up here
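To make steps 3 to 5 more concrete, here is a minimal sketch of the "parse logs, pick offenders, create DENY rules" loop. It is not the tool mentioned above. The log format, the `SAMPLE_LOG` data, the function names, the request-count threshold and the starting priority are all assumptions made for illustration. The only real interface it relies on is the `gcloud app firewall-rules create PRIORITY --action=... --source-range=...` command, and it only builds the command strings rather than running them.

```python
import ipaddress
from collections import Counter

# Hypothetical request log as (client IP, request path) pairs. In practice
# these would be extracted from your App Engine request logs.
SAMPLE_LOG = (
    [("203.0.113.7", "/wp-login.php")] * 5
    + [("198.51.100.23", "/")]
    + [("203.0.113.9", "/.env")] * 3
)


def find_noisy_ips(entries, threshold):
    """Return the IPs with at least `threshold` requests, busiest first."""
    counts = Counter(ip for ip, _path in entries)
    return [ip for ip, hits in counts.most_common() if hits >= threshold]


def deny_rule_commands(ips, start_priority=1000):
    """Build one `gcloud app firewall-rules create` command per IP.

    Priorities are assigned sequentially from `start_priority` (an
    arbitrary choice here); they only need to be evaluated before the
    default allow-all rule.
    """
    commands = []
    for offset, ip in enumerate(ips):
        ipaddress.ip_address(ip)  # raises ValueError on a malformed address
        commands.append(
            f"gcloud app firewall-rules create {start_priority + offset} "
            f"--action=DENY --source-range={ip} "
            f'--description="auto-blocked noisy client"'
        )
    return commands


if __name__ == "__main__":
    for command in deny_rule_commands(find_noisy_ips(SAMPLE_LOG, threshold=3)):
        print(command)
```

A raw request count is a crude signal: a busy legitimate user, or many users behind one NAT, can exceed the threshold. You would probably want to review the generated commands before running them, and combine the count with other hints such as the paths requested.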

    PS: A trick I sometimes use is to put the key parts of my service behind a non-intuitive path. That way the spam/bots only hit the default/base URL and common paths, and I have a separate service that returns 404 for all calls to those URLs. This is at best a band-aid.
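The separate "404 for everything" service from the PS could be as small as the WSGI callable below. This is a sketch under assumptions: it presumes the base URL and common paths are routed to this service (for example with App Engine dispatch rules, not shown here) while the real app lives under the non-intuitive path. The name `app` is just the conventional WSGI entry point name.

```python
def app(environ, start_response):
    """WSGI callable that answers every request with a plain 404."""
    body = b"Not Found"
    start_response(
        "404 Not Found",
        [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]
```

Because it imports nothing and does no work per request, it should be cheap to start and run compared with the full Dash app. Requests still reach an instance of this service, though, which is one reason the trick is only a band-aid.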