Consider this:
main:
steps:
- init:
assign:
- shared_var: []
- derp:
parallel:
shared: [shared_var]
for:
value: item
in: <SOMETHING TO LOOP OVER>
steps:
- do_something:
call: http.post
args:
url: https://<CLOUD FUNCTION>
result: func_output
- add:
assign:
- shared_var: ${list.concat(shared_var, func_output["body"])}
I've been under the assumption that concurrent updates to shared_var
here are thread safe is that correct? I see nothing in the GCP docs regarding this though.
shared_var
will always be updated concurrently and I don't have to worry about race conditions right? Is there any information on how workflows implement this with locking or something?
From this doc1 “As shared variables are atomic” and this doc2 “All assignments in parallel steps are atomic”.
By definition of atomic:
“Making the operation atomic consists in using synchronization mechanisms in order to make sure that the operation is seen, from any other thread, as a single, atomic (i.e. not splittable in parts), operation. That means that any other thread, once the operation is made atomic, will either see the value of foo
before the assignment, or after the assignment. But never the intermediate value.”
It means that the updates to those variables are handled in a way that ensures there are no partial or inconsistent updates. Each update to the shared variable will be applied completely and atomically, without interference from other parallel executions.
This means that when multiple parallel steps try to update a shared variable at the same time, those updates are either fully applied or not applied at all, preventing race conditions and ensuring data consistency.
Checking relevant docs and seeing that no similar issues were raised to the public issue tracker led me to believe (also agreeing with @guillaume blaquiere) that shared variables are thread safe. Although, as initially observed, this is not explicitly mentioned in the public docs but can be inferred by the above statements.