google-cloud-platform google-cloud-storage slurm gcsfuse

Slurm cluster in Google Cloud: Data in a directory mounted on the controller/login node is not available on compute nodes


I have created a Slurm cluster following this tutorial. I have also created a bucket that stores data the compute nodes need to access. Since the compute nodes share the home directory of the login node, I mounted the bucket on my login node using gcsfuse. However, when I run a simple script, test.py, that prints the contents of the mounted directory on a compute node, the directory appears empty. The folder itself is there, as is the Python file.
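The script itself is not reproduced here. A minimal sketch of what such a test.py might look like follows; the path `~/target/folder` and the function name `describe_dir` are placeholders, not taken from the original script. Besides listing the directory, it prints the hostname and whether the path is a mount point on the node where it runs, which helps distinguish an empty bucket from a directory that is simply not mounted on that node.

```python
import os
import socket


def describe_dir(path):
    """Return a short report on `path` as seen from the current node."""
    exists = os.path.isdir(path)
    return {
        "host": socket.gethostname(),
        "exists": exists,
        # False means this node only sees the plain underlying directory,
        # not a filesystem mounted on top of it.
        "is_mount": os.path.ismount(path),
        "entries": sorted(os.listdir(path)) if exists else [],
    }


if __name__ == "__main__":
    # Hypothetical path: replace with the directory passed to gcsfuse.
    report = describe_dir(os.path.expanduser("~/target/folder"))
    for key, value in report.items():
        print(f"{key}: {value}")
```

Running it once on the login node and once through Slurm on a compute node makes the two views directly comparable.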

Is there something I have to specify in the YAML configuration file to make the mounted directory accessible on the compute nodes?

I have written down the steps that I have taken in order to mount the directory:

When creating the Slurm cluster using

gcloud deployment-manager deployments create google1 --config slurm-cluster.yaml

it is important that the node that should mount the storage directory has sufficient permissions. Uncomment or add the following in the slurm-cluster.yaml file if your login node should mount the data. (Do the same for the controller node instead if you prefer.)

login_node_scopes          :
     - https://www.googleapis.com/auth/devstorage.read_write

Next, log in to the login node and install gcsfuse. After installing gcsfuse, you can mount the bucket using the following command:

gcsfuse --implicit-dirs <BUCKET-NAME> target/folder/

Note that the service account attached to your VM must have access rights on the bucket. You can find the name of the service account in the details of your VM in the Cloud Console or by running the following command on the VM:

gcloud auth list
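The same information can also be read from the Compute Engine metadata server on the VM itself. The sketch below queries the default service account's email and scopes and reports which storage-related scopes are present. The helper names (`fetch_metadata`, `storage_scopes`, `report`) are illustrative, and the list of scopes treated as storage-relevant is my own assumption. Access scopes are only one part of the picture: the service account still needs IAM permissions on the bucket.

```python
import urllib.request

# Metadata server path for the VM's default service account.
METADATA_BASE = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/"
)

# Scopes assumed here to allow some form of Cloud Storage access.
STORAGE_SCOPES = (
    "https://www.googleapis.com/auth/devstorage.read_only",
    "https://www.googleapis.com/auth/devstorage.read_write",
    "https://www.googleapis.com/auth/devstorage.full_control",
    "https://www.googleapis.com/auth/cloud-platform",
)


def fetch_metadata(key, timeout=5):
    """Fetch one metadata value, e.g. 'email' or 'scopes' (only works on a GCE VM)."""
    request = urllib.request.Request(
        METADATA_BASE + key, headers={"Metadata-Flavor": "Google"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


def storage_scopes(scopes_text):
    """Return the storage-related scopes found in whitespace-separated scope text."""
    granted = set(scopes_text.split())
    return [scope for scope in STORAGE_SCOPES if scope in granted]


def report():
    """Print the service account and its storage scopes for this VM."""
    print("service account:", fetch_metadata("email"))
    print("storage scopes:", storage_scopes(fetch_metadata("scopes")))
```

Calling `report()` on the login node and on a compute node shows whether both carry a scope such as devstorage.read_write.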

Solution

  • I've just got a similar setup working. I don't have a definite answer to why yours isn't, but a few notes:

    network_storage        :
    - server_ip: none
      remote_mount: mybucket
      local_mount: /data
      fs_type: gcsfuse
      mount_options: file_mode=664,dir_mode=775,allow_other
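To confirm that a configuration like this actually results in the bucket being mounted on a given node, one option is to inspect that node's mount table from inside a job. The following is a sketch under two assumptions: the node exposes a Linux-style /proc/mounts, and gcsfuse mounts are reported with a filesystem type beginning with "fuse". The function names are illustrative.

```python
def find_mount(mounts_text, mount_point):
    """Return (device, fs_type) for `mount_point`, or None if it is not mounted."""
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == mount_point:
            # Later entries shadow earlier ones mounted at the same path.
            found = (fields[0], fields[2])
    return found


def is_fuse_mount(mounts_text, mount_point):
    """True if `mount_point` is mounted with a FUSE-based filesystem type."""
    entry = find_mount(mounts_text, mount_point)
    return entry is not None and entry[1].startswith("fuse")


def check_node(mount_point="/data", mounts_file="/proc/mounts"):
    """Print what is mounted at `mount_point` on the current node."""
    with open(mounts_file) as handle:
        text = handle.read()
    print(mount_point, "->", find_mount(text, mount_point))
    print("fuse mount:", is_fuse_mount(text, mount_point))
```

Calling `check_node("/data")` from a script submitted with srun or sbatch reports the state on the compute node rather than on the login node. A result of `None` there would indicate that the bucket is not mounted on that node.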