Tags: akka.net, akka.net-cluster, akka.net-networking

Akka.NET: Restrict child actor creation in akka.net cluster to a single machine


We have a particular scenario in our application: all the child actors deal with a huge volume of data (around 50 - 200 MB each). Because of this, we decided to create the child actors on the same machine (worker process) on which the parent actor was created.

Currently, this is achieved through the use of Roles. We also use the .NET memory cache to transfer the data (several MBs) between child actors.

Question: Is it OK to turn off clustering for the child actors to achieve the result we are expecting?

Edit: To be more specific, I have explained our application setup in detail below.

When we discovered the network overhead caused by distributing the child actors across machines, we decided to restrict child actor creation to the machine that received the primary request, and to distribute only the parent actors across machines.

When we approached an Akka.NET expert with this problem, we were advised to use "Roles" to restrict child actor creation to a single machine in the cluster (e.g., Worker1Child and Worker2Child roles instead of a single "Child" role).

Question (contd.): I just want to know whether simply disabling the cluster option for the child actors will achieve the same result, and whether doing so is a best practice.

Please advise.


Solution

  • Sounds to me like you've been using a clustered pool router to remotely deploy worker actors across the cluster - you didn't explicitly mention this in your description, but that's what it sounds like.

    It also sounds like what you're really trying to do here is take advantage of local affinity: have child worker actors for the same entities all work together inside the same process.

    Here's what I would recommend:

    1. Have all worker actors created locally, as children of their parents inside the same process, using either something like the child-per-entity pattern or a LOCAL pool router.
    2. Distribute work between the worker nodes using a clustered group router, using roles etc.
    3. All of the work in that high-volume workload should flow directly from parent to children, without round-tripping through the rest of the cluster.
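
    As a rough illustration of steps 1 and 2, a deployment configuration along these lines might work. This is only a sketch under assumptions: the actor names `work-router`, `parent`, and `children`, the `worker` role, and the pool size of 5 are all hypothetical, and the routers would still need to be created at those paths in code (for example with `FromConfig.Instance`).

    ```hocon
    akka {
      actor {
        provider = cluster
        deployment {
          # Step 2: clustered GROUP router. It only routes messages to parent
          # actors that already exist at /user/parent on nodes with the role;
          # it does not deploy anything remotely. (Names are hypothetical.)
          /work-router {
            router = round-robin-group
            routees.paths = ["/user/parent"]
            cluster {
              enabled = on
              allow-local-routees = on
              use-role = worker
            }
          }
          # Step 1: LOCAL pool router owned by the parent. There is no
          # cluster section here, so the children are created inside the
          # parent's own process. (Pool size is an assumed value.)
          /parent/children {
            router = round-robin-pool
            nr-of-instances = 5
          }
        }
      }
      cluster {
        roles = ["worker"]
      }
    }
    ```

    With a layout like this, only the small "start this job" message crosses the network through the group router; the large payloads stay between a parent and its local children.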

    Given the information that you've provided here, this is as close to a "general" answer as I can provide - hope you find it helpful!