python, tensorflow, gpu, google-colaboratory, distributed-training

Distributed training over a local GPU and a Colab GPU


I want to fine-tune ALBERT.

I see that one can distribute neural-net training over multiple GPUs using TensorFlow: https://www.tensorflow.org/guide/distributed_training

I was wondering if it's possible to distribute the fine-tuning across both my laptop's GPU and a Colab GPU?
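For reference, the single-machine, multi-GPU case from that guide looks roughly like this for ALBERT. This is only a sketch, assuming the Hugging Face transformers package and a tokenized tf.data.Dataset called train_dataset prepared elsewhere:

    import tensorflow as tf
    from transformers import TFAlbertForSequenceClassification

    # MirroredStrategy replicates the model across every GPU visible on this machine.
    strategy = tf.distribute.MirroredStrategy()
    print("Replicas in sync:", strategy.num_replicas_in_sync)

    with strategy.scope():
        # The model and optimizer must be created inside the strategy scope.
        model = TFAlbertForSequenceClassification.from_pretrained(
            "albert-base-v2", num_labels=2)
        model.compile(
            optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),
            loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
            metrics=["accuracy"])

    # train_dataset is assumed to be a tokenized tf.data.Dataset of (features, labels).
    # model.fit(train_dataset, epochs=3)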


Solution

  • I don't think that's possible. Distributed GPU training needs a high-bandwidth interconnect between the GPUs (NVLink within a machine, or a fast network link that NCCL can use between machines), and no such link exists between your laptop's GPU and a Colab GPU; see the sketch below for how a multi-machine setup is normally configured. This is a good read: https://lambdalabs.com/blog/introduction-multi-gpu-multi-node-distributed-training-nccl-2-0/
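To illustrate the point, TensorFlow's usual way to train across two machines is MultiWorkerMirroredStrategy, and it requires every worker to be directly reachable at a known host:port, which a laptop and a Colab VM cannot provide each other. A minimal sketch (the hostnames and ports below are placeholders, not a working configuration):

    import json
    import os

    import tensorflow as tf

    # TF_CONFIG must be set before the strategy is created. Each worker lists
    # every worker's host:port, and all of them must be mutually reachable over
    # the network. The addresses here are placeholders only.
    os.environ["TF_CONFIG"] = json.dumps({
        "cluster": {"worker": ["laptop.example:12345", "colab.example:12345"]},
        "task": {"type": "worker", "index": 0},
    })

    strategy = tf.distribute.MultiWorkerMirroredStrategy()

    with strategy.scope():
        # Model and optimizer creation goes here, same as the single-machine case.
        model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
        model.compile(optimizer="adam", loss="mse")

If you want more compute than your laptop offers, it is simpler to run the whole fine-tuning job on the Colab GPU (or another single machine with enough GPUs) than to try to split it across the two.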