google-cloud-platformgoogle-cloud-pubsubevent-stream

How to stream events with GCP platform?


I am looking into building a simple solution where producer services push events to a message queue and then have a streaming service make those available through gRPC streaming API.

Cloud Pub/Sub seems well suited for the job however scaling the streaming service means that each copy of that service would need to create its own subscription and delete it before scaling down and that seems unnecessarily complicated and not what the platform was intended for.

On the other hand Kafka seems to work well for something like this but I'd like to avoid having to manage the underlying platform itself and instead leverage the cloud infrastructure.

I should also mention that the reason for having a streaming API is to allow for streaming towards a frontend (who may not have access to the underlying infrastructure)

Is there a better way to go about doing something like this with the GCP platform without going the route of deploying and managing my own infrastructure?


Solution

  • If you essentially want ephemeral subscriptions, then there are a few things you can set on the Subscription object when you create a subscription:

    1. Set the expiration_policy to a smaller duration. When a subscriber is not receiving messages for that time period, the subscription will be deleted. The tradeoff is that if your subscriber is down due to a transient issue that lasts longer than this period, then the subscription will be deleted. By default, the expiration is 31 days. You can set this as low as 1 day. For pull subscribers, the subscribers simply need to stop issuing requests to Cloud Pub/Sub for the timer on their expiration to start. For push subscriptions, the timer starts based on when no messages are successfully delivered to the endpoint. Therefore, if no messages are published or if the endpoint is returning an error for all pushed messages, the timer is in effect.

    2. Reduce the value of message_retention_duration. This is the time period for which messages are kept in the event a subscriber is not receiving messages and acking them. By default, this is 7 days. You can set it as low as 10 minutes. The tradeoff is that if your subscriber disconnects or gets behind in processing messages by more than this duration, messages older than that will be deleted and the subscriber will not see them.

    Subscribers that cleanly shut down could probably just call DeleteSubscription themselves so that the subscription goes away immediately, but for ones that shut down unexpectedly, setting these two properties will minimize the time for which the subscription continues to exist and the number of messages (that will never get delivered) that will be retained.

    Keep in mind that Cloud Pub/Sub quotas limit one to 10,000 subscriptions per topic and per project. Therefore, if a lot of subscriptions are created and either active or not cleaned up (manually, or automatically after expiration_policy's ttl has passed), then new subscriptions may not be able to be created.