google-cloud-platform, google-genomics, nextflow

Additional 500 GB persistent disk attached by default


I am trying to run a workflow on GCP using Nextflow. The problem is that whenever an instance is created to run a process, it has two disks attached: the default boot disk (10 GB) and an additional 'google-pipelines-worker' disk (500 GB). When I run multiple processes in parallel, multiple VMs are created and each one gets its own additional 500 GB disk. Is there any way to customize the 500 GB default?

nextflow.config

process {
    executor = 'google-pipelines'
}

cloud {
    driver = 'google'
}

google {
    project = 'my-project'
    zone = 'europe-west2-b'
}

main.nf

#!/usr/bin/env nextflow

barcodes = Channel.from(params.analysis_cfg.barcodes.keySet())

process run_pbb {
    machineType 'n1-standard-2'
    container 'eu.gcr.io/my-project/container-1'

    output:
    file 'this.txt' into barcodes_ch

    script:
    """
    sleep 500
    touch this.txt
    """
}

The code provided is just a sample; essentially, each process creates a VM instance with an additional 500 GB standard persistent disk attached to it.


Solution

  • Nextflow added support for this in a recent edge release, so I will leave the answer here.

    First run export NXF_VER=19.09.0-edge

    Then, in the 'process' scope, you can declare a disk directive like so:

    process this_process {
        disk '100 GB'
    }
    

    This overrides the size of the attached persistent disk (default: 500 GB).

    There is still no way to change the size of the boot disk (default: 10 GB).
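
    If you prefer to keep resource settings out of the pipeline script, the disk directive can also be set in nextflow.config, either for every process or for a single process via a withName selector. The sketch below is not verified against the google-pipelines executor; the run_pbb selector simply reuses the example process name from the question:

    process {
        executor = 'google-pipelines'

        // Override the default 500 GB worker disk for all processes
        disk = '100 GB'

        // Or target a single process by name (run_pbb is the example
        // process from the question)
        withName: run_pbb {
            disk = '50 GB'
            machineType = 'n1-standard-2'
        }
    }

    With this in place, the process definitions in main.nf no longer need their own disk or machineType directives, and per-process values in the config take precedence over the process-scope defaults.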