I am working on a project that uses OpenWhisk. I have created a Kubernetes cluster on Google Cloud with 5 nodes and installed OpenWhisk (OW) on it. My serverless function is written in Java and does some processing based on the arguments I pass to it. The processing can last up to 30 seconds, and I invoke the function multiple times during those 30 seconds, so I want more runtime containers (pods) to be created without having to wait for the previous invocation to finish. Ideally, there should be a container for each invocation until the resources are exhausted.
What actually happens is that when I start invoking the function, the first container is created, and then after a few seconds another one, to serve the first two invocations. From that point on, I continue invoking the function (no more than 5 simultaneous invocations), but no new containers are started. Then, after some time, a third container is created and sometimes, but rarely, a fourth one, but only after a long time. What is even weirder is that the containers are all started on a single cluster node, or sometimes on two nodes (always the same two). The other nodes are not used. I have set up the cluster carefully, and each node is labeled as an invoker. I have tried experimenting with the memory assigned to each container and the maximum number of containers, and I have increased the maximum number of invocations allowed per minute, but despite all this I haven't been able to increase the number of containers created. I have also tried different machine types for the cluster (different numbers of cores and amounts of memory), but in vain.
Since OpenWhisk is still a relatively young project, I unfortunately can't find enough information in the official documentation. Can someone explain how OpenWhisk decides when to start a new container? What parameters can I change in values.yaml to get a greater number of containers?
The reason so few containers were created is that the worker nodes did not have the Java runtime Docker image, so it had to be downloaded on each node the first time this runtime was requested there. The image weighs a few hundred MB and takes time to download (a couple of seconds on a Google Cloud cluster). I don't know why the OpenWhisk controller decided to wait for already created pods to become available instead of downloading the image on the other nodes. Anyway, once I had downloaded the image manually on each of the nodes, with the same application and the same request rate, a new pod was created for each request that could not be served by an existing pod.
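Instead of pulling the image by hand on every node, the same effect can probably be achieved with a pre-pull DaemonSet. The manifest below is only a sketch of that idea, not what I actually ran. It assumes that OpenWhisk is installed in a namespace called `openwhisk`, that the invoker nodes carry the label `openwhisk-role=invoker`, that the Java runtime image is `openwhisk/java8action:latest`, that this image contains `sh`, and that action pods are started through the Kubernetes container factory, so they share the node's image cache. The DaemonSet name is hypothetical. Adjust all of these to match your deployment.

```yaml
# Sketch: pre-pull the Java runtime image on every invoker node.
# Assumptions (adjust to your setup): namespace "openwhisk",
# node label "openwhisk-role=invoker", image "openwhisk/java8action:latest".
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: prepull-java-runtime        # hypothetical name
  namespace: openwhisk
spec:
  selector:
    matchLabels:
      name: prepull-java-runtime
  template:
    metadata:
      labels:
        name: prepull-java-runtime
    spec:
      nodeSelector:
        openwhisk-role: invoker     # schedule only on invoker nodes
      initContainers:
        # Starting this container forces the node to pull the image.
        # The command is a no-op and exits immediately (assumes the image has sh).
        - name: pull-java-runtime
          image: openwhisk/java8action:latest
          imagePullPolicy: IfNotPresent
          command: ["sh", "-c", "true"]
      containers:
        # A tiny long-running container keeps the DaemonSet pod alive,
        # so nodes that join later also get the image pulled.
        - name: pause
          image: registry.k8s.io/pause:3.9
```

Saved as, for example, `prepull.yaml`, this can be applied with `kubectl apply -f prepull.yaml`. Once the pod on each node reports Running, the image should be in that node's local cache.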