I have 10 requests per second of data I want to save that looks like the entry below. I need to save this data after a CloudRun function completes. (My infrastructure is on google-cloud-platform
). The data will be used as a data set for machine learning.
{
"text": "1k characters",
"text2": "1k characters",
"metadata1": "enum (100 vals)",
"metadata2": "number value"
}
I planned to save this as a non-awaited function to google-cloud-storage
either in one folder or in folders based on the metadata1 enum
. Is either better than the other?
Is this the appropriate route to take?
I think pubsub is overkill as suggested in this SO answer.
I can propose you 2 patterns, but in both case you need to store the messages: