unsupervised-learning, distributed-training, amazon-machine-learning, amz-sagemaker-distributed-training

Distributed Unsupervised Learning in SageMaker


I am currently running unsupervised learning (predominantly clustering) locally on a single large GPU node.

Does SageMaker support distributed unsupervised learning with clustering?

If yes, please provide a relevant example (preferably non-TensorFlow).


Solution

  • SageMaker Training allows you to bring your own training script and supports several forms of distributed training, including data parallelism and model parallelism, with frameworks such as PyTorch DDP, Horovod, and DeepSpeed. Alternatively, if you want to bring your data but not your code, SageMaker offers several unsupervised built-in algorithms, some of which are parallelizable; in particular, the built-in K-Means clustering algorithm can train across multiple instances (set the estimator's instance count above 1) and does not depend on TensorFlow.
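To make the data-parallel clustering idea concrete, below is a minimal, framework-free sketch of one round of distributed k-means: each "worker" computes per-cluster sums and counts on its own data shard, and those partial statistics are reduced into a single centroid update. This illustrates the concept only; it is not SageMaker's actual implementation, and the function names are hypothetical.

```python
import numpy as np

def local_stats(shard, centroids):
    """One worker's pass: assign each point in this shard to its nearest
    centroid, then return per-cluster sums and counts for the reduce step."""
    k = centroids.shape[0]
    # Squared distances, shape (n_points, k)
    d = ((shard[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = d.argmin(axis=1)
    sums = np.zeros_like(centroids)
    counts = np.zeros(k)
    for j in range(k):
        mask = labels == j
        sums[j] = shard[mask].sum(axis=0)
        counts[j] = mask.sum()
    return sums, counts

def distributed_kmeans_step(shards, centroids):
    """Reduce step: combine all workers' partial statistics into new
    centroids. In a real cluster this loop would be an all-reduce."""
    total_sums = np.zeros_like(centroids)
    total_counts = np.zeros(centroids.shape[0])
    for shard in shards:
        s, c = local_stats(shard, centroids)
        total_sums += s
        total_counts += c
    # Keep the old centroid for any cluster that received no points
    nonempty = total_counts > 0
    new_centroids = centroids.copy()
    new_centroids[nonempty] = total_sums[nonempty] / total_counts[nonempty, None]
    return new_centroids

rng = np.random.default_rng(0)
data = rng.normal(size=(1200, 2))
shards = np.array_split(data, 4)   # simulate 4 workers
centroids = data[:3].copy()        # k = 3, seeded from the data
for _ in range(10):
    centroids = distributed_kmeans_step(shards, centroids)
```

Because k-means updates reduce to sums and counts, one step over four shards produces exactly the same centroids as one step over the full dataset, which is why the algorithm parallelizes cleanly. The SageMaker built-in K-Means estimator exposes this through its instance count parameter.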