c++ distributed-computing actor c++-actor-framework

How to "publish" a large number of actors in CAF?


I've just learned about CAF, the C++ Actor Framework.

The one thing that surprised me is that the way to make an actor available over the network is to "publish" it to a specific TCP port.

This basically means that the number of actors you can publish is limited by the number of available ports (about 64k). Since you need one port to publish an actor and one port to access a remote actor, I assume that two processes could share at best about 32k actors each, even though each could probably hold a million actors on a commodity server. This would be even worse if the cluster had, say, 10 nodes.

To make publishing scalable, each process should only need to open one port for all actors in its system, and open one connection to each actor system it wants to access.

Is there a way to publish one actor as a proxy for all actors in an actor system (preferably without any significant performance loss)?


Solution

  • Let me add some background. The middleman::publish/middleman::remote_actor function pair does two things: it connects two CAF instances, and it gives you a handle for communicating with a remote actor. The actor you "publish" to a given port is meant to act as an entry point. It is a convenient rendezvous point, nothing more.

    All you need to communicate between two actors is a handle. Of course you need to somehow learn new handles if you want to talk to more actors. The remote_actor function is simply a convenient way to implement a rendezvous between two actors. However, after you learn the handle you can freely pass it around in your distributed system. Actor handles are network transparent.

    Also, CAF will always maintain a single TCP connection between two actor systems. If you publish 10 actors on host A and "connect" to all 10 actors from host B via remote_actor, you'll see that CAF initially opens 10 connections (because the target node could run multiple actor systems), but all but one of them will get closed.

    If you don't care about the rendezvous for actors offered by publish/remote_actor, then you can also use middleman::open and middleman::connect instead. This only connects two CAF instances without exchanging actor handles. Instead, connect returns a node_id on success. This is all you need for some features, for example remote spawning of actors.

    Is there a way to publish one actor as a proxy for all actors in an actor system (preferably without any significant performance loss)?

    You can publish one actor at a port whose sole purpose is to model a rendezvous point. If that actor sends 1000 more actor handles to a remote actor, this will not cause any additional network connections.

    Writing a custom actor that explicitly models the rendezvous between multiple systems by offering some sort of dictionary is the recommended way.

    Just for the sake of completeness: CAF also has a registry mechanism. However, keys are limited to atom values, i.e., 10 characters or fewer. Since the registry is generic, it also only stores strong_actor_ptr and leaves type safety to you. However, if that's all you need, you put handles into the registry (see actor_system::registry) and then access this registry remotely via middleman::remote_lookup (you only need a node_id to do this).
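To make the "one published directory, many handles, one connection" idea concrete, here is a minimal sketch in plain standard C++. It deliberately does not use CAF: `Handle`, `Directory`, `Client`, and `simulate_lookup` are hypothetical names invented for illustration, and the connection bookkeeping is only a model of the behavior described above, not CAF's implementation. In a real system the directory would be an actor published once via middleman::publish, and the handles would be real actor handles.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-in for a network-transparent actor handle:
// a (node, actor id) pair. This is NOT caf::actor.
struct Handle {
  std::string node;
  std::uint64_t id;
};

// Hypothetical directory "actor": the only thing that would be
// published on a port. It maps names to handles.
class Directory {
public:
  void put(const std::string& name, Handle h) {
    entries_[name] = std::move(h);
  }

  // Returns nullptr if no handle is registered under `name`.
  const Handle* get(const std::string& name) const {
    auto it = entries_.find(name);
    return it == entries_.end() ? nullptr : &it->second;
  }

private:
  std::map<std::string, Handle> entries_;
};

// Models the client side: at most one connection per remote node,
// no matter how many handles on that node the client learns.
class Client {
public:
  // Simulates the initial rendezvous with the published directory.
  void connect(const std::string& node) { connections_.insert(node); }

  // Learning a handle on an already connected node adds nothing.
  void learn(const Handle& h) { connections_.insert(h.node); }

  std::size_t connection_count() const { return connections_.size(); }

private:
  std::set<std::string> connections_;
};

struct LookupResult {
  std::size_t resolved;    // handles the client obtained
  std::size_t connections; // distinct nodes the client is connected to
};

// Registers `n` workers on node "A", then resolves all of them from a
// client that only ever contacted the directory.
LookupResult simulate_lookup(std::uint64_t n) {
  Directory dir;
  for (std::uint64_t i = 0; i < n; ++i)
    dir.put("worker-" + std::to_string(i), Handle{"A", i});

  Client client;
  client.connect("A"); // the single rendezvous with the directory
  std::size_t resolved = 0;
  for (std::uint64_t i = 0; i < n; ++i) {
    if (const Handle* h = dir.get("worker-" + std::to_string(i))) {
      client.learn(*h);
      ++resolved;
    }
  }
  return LookupResult{resolved, client.connection_count()};
}
```

Calling `simulate_lookup(1000)` from a `main` function resolves all 1000 names while the modeled client stays connected to a single node. Unlike the atom-keyed registry, a custom directory like this can use arbitrary strings as keys and can store typed handles.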