
How to run a multi-node parallel job with AWS ParallelCluster in R


I am interested in running a multi-node parallel job with AWS ParallelCluster in R. Is there any useful documentation, guide, or R package that helps with this? As far as I understand, the paws package (library(paws)) does not support that service. Thank you.


Solution

  • It looks like the paws package is a client for individual AWS services. AWS ParallelCluster, however, is a downloadable Python package that helps you orchestrate multiple AWS services into a Slurm-powered, dynamically scaling HPC cluster in the cloud.

    Once you've configured your cloud HPC system using ParallelCluster, you can log into it using SSH or AWS Systems Manager and interact with it like any other Slurm cluster you might have experience with.

    At a high level, your roadmap looks like this:

    1. Install ParallelCluster
    2. Design and configure your HPC cluster
    3. Log into your cluster and install R in the shared $HOME directory
    4. Run your multi-node parallel R job using a Slurm batch script
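
As a rough illustration of step 4, here is a minimal sketch, not a definitive setup. It assumes the ParallelCluster 3.x CLI for the commented-out steps 1 to 3, a working R install on the shared filesystem, and passwordless SSH between compute nodes (needed because the parallel package starts PSOCK workers over SSH). The file names, cluster name, key path, and node count are all made up for the example. Running the snippet on the head node only writes two files; it does not submit anything.

```shell
#!/bin/bash
# Sketch only: writes a hypothetical R script and Slurm batch script into the
# current directory. All names and sizes below are illustrative assumptions.

# Steps 1-3 of the roadmap, run beforehand (ParallelCluster 3.x syntax assumed;
# "my-cluster" and the key path are placeholders):
#   pip3 install aws-parallelcluster
#   pcluster configure --config cluster-config.yaml
#   pcluster create-cluster --cluster-name my-cluster \
#       --cluster-configuration cluster-config.yaml
#   pcluster ssh --cluster-name my-cluster -i ~/.ssh/my-key.pem

# R script: one PSOCK worker per allocated node, using base R's parallel package.
cat > multinode.R <<'EOF'
library(parallel)

# Hostnames of the allocated nodes, one per line, written by the batch script.
hosts <- readLines(Sys.getenv("NODEFILE"))

# Workers are launched over SSH; Rscript must exist at the same path on every
# node, which is why R lives on the shared filesystem.
cl <- makePSOCKcluster(hosts)

# Toy workload: each task reports which node it ran on.
res <- parSapply(cl, 1:8, function(i) paste(Sys.info()[["nodename"]], i^2))
print(res)

stopCluster(cl)
EOF

# Slurm batch script: request two nodes and hand their hostnames to R.
cat > multinode.sbatch <<'EOF'
#!/bin/bash
#SBATCH --job-name=r-multinode
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=1
#SBATCH --output=r-multinode-%j.out

# Expand Slurm's compact node list into one hostname per line.
export NODEFILE="$PWD/nodes-$SLURM_JOB_ID.txt"
scontrol show hostnames "$SLURM_JOB_NODELIST" > "$NODEFILE"

Rscript multinode.R
EOF

echo "Wrote multinode.R and multinode.sbatch"
```

You would then submit the job from the head node with `sbatch multinode.sbatch` and read the results from the `r-multinode-<jobid>.out` file. If your R code uses MPI instead (for example via Rmpi), the batch script would typically launch R through `srun` or `mpirun` rather than building a PSOCK cluster.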