job-scheduling ray

Can I force Ray tasks/actors to run on specific nodes?


I know that a group of Ray tasks will communicate with the same actor(s), which will cause a lot of I/O between the actor(s) and the tasks.
I want to know if there is a way to force the actor(s) and the tasks to run on the same node, to optimize that I/O.


Solution

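  • As of Ray 1.13, you can use ray.util.scheduling_strategies.NodeAffinitySchedulingStrategy(node_id, soft: bool) to pin a task or actor to a specific node. node_id identifies the target node, and soft controls whether Ray may fall back to another node when the target cannot host the task or actor.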

    Here is a simple example:

    import ray
    from ray.util.scheduling_strategies import NodeAffinitySchedulingStrategy

    @ray.remote
    class Actor:
        pass
    
    # "DEFAULT" scheduling strategy is used (packed onto nodes until reaching a threshold and then spread).
    a1 = Actor.remote()
    
    # Zero-CPU (and no other resources) actors are randomly assigned to nodes.
    a2 = Actor.options(num_cpus=0).remote()
    
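    # Only run the actor on the local node; soft=False means Ray will not
    # fall back to another node if this one cannot host the actor.
    # (Newer Ray releases also offer ray.get_runtime_context().get_node_id().)
    a3 = Actor.options(
        scheduling_strategy=NodeAffinitySchedulingStrategy(
            node_id=ray.get_runtime_context().node_id,
            soft=False,
        )
    ).remote()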