Can SageMaker Training keep training data on NVMe volumes on compatible instances (e.g. G4dn and P3dn)? If so, is there an appropriate way to programmatically access that data?
Yes. On all Nitro-based instances, EBS volumes are exposed as NVMe block devices.
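For illustration (this check is not SageMaker-specific), you can see those NVMe devices from inside a Nitro-based Linux instance; a minimal sketch:

```python
import glob

# On a Nitro-based instance, EBS volumes appear as /dev/nvme* block devices;
# on other hosts this list is simply empty.
nvme_devices = sorted(glob.glob('/dev/nvme*'))
print(nvme_devices)
```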
In the SageMaker Python SDK you can specify the volume_size of the training EBS volume. The training channel data is downloaded onto that EBS (NVMe-backed) volume, and its local path is exposed to your code via the SM_CHANNEL_TRAIN environment variable; when you actually run, you pass that path to your code (e.g. as --train_dir).
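As a minimal sketch (the channel name 'train' and the conventional fallback mount point are assumptions), the path can be read inside the training container like this:

```python
import os

# SageMaker sets SM_CHANNEL_<NAME> for each input channel; a channel named
# 'train' is assumed here. Outside a training container the variable is unset,
# so fall back to the conventional channel mount point.
train_dir = os.environ.get('SM_CHANNEL_TRAIN', '/opt/ml/input/data/train')
print("training data path:", train_dir)

# List the channel contents when the path exists (i.e. inside the container)
if os.path.isdir(train_dir):
    for name in sorted(os.listdir(train_dir)):
        print(name)
```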
Code example below:
from sagemaker.tensorflow import TensorFlow

def main(aws_region, s3_location, instance_count):
    estimator = TensorFlow(
        train_instance_type='ml.p3.16xlarge',
        train_volume_size=200,  # size in GB of the EBS (NVMe-backed) volume
        train_instance_count=int(instance_count),
        framework_version='2.2',
        py_version='py3',
        image_name="231748552833.dkr.ecr.%s.amazonaws.com/sage-py3-tf-hvd:latest" % aws_region,
        # ... other required arguments (entry_point, role) omitted
    )
    # Map the S3 data to the 'train' channel; it is downloaded onto the EBS
    # volume and exposed inside the container via SM_CHANNEL_TRAIN.
    estimator.fit({'train': s3_location})
And then in your entry script:
import os
import subprocess

# SageMaker exposes the local path of the 'train' channel via this variable
train_dir = os.environ.get('SM_CHANNEL_TRAIN')
subprocess.call(['python', '-W', 'ignore',
                 'deep-learning-models/legacy/models/resnet/tensorflow2/train_tf2_resnet.py',
                 '--data_dir=%s' % train_dir])  # further training flags omitted here