amazon-web-services amazon-sagemaker

AWS SageMaker - What is Channel?


What exactly is a channel in AWS SageMaker?

SageMaker Documentation - Channel

A channel is a named input source that training algorithms can consume.

Input Data Configuration

For example, suppose that you specify three data channels (train, evaluation, and validation) in your request.

Is it a location where the data for training, validation, or testing is stored? Or is it a mechanism, such as a UNIX pipe, by which data is fed into the SageMaker runtime for consumption?

Where is the term defined?


Solution

  • A channel in SageMaker is a named source of input data for training a machine learning model. Channels define how training data is fed into a SageMaker training job.

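    Let's take an example. If you have two input channels (train and test), the training container gets two environment variables, SM_CHANNEL_TRAIN and SM_CHANNEL_TEST, each holding the directory inside the container that contains that channel's data.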

    See this document for the SageMaker environment variables: https://github.com/aws/sagemaker-training-toolkit/blob/master/ENVIRONMENT_VARIABLES.md

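    Each channel specifies a name, a data source (such as an S3 location), and optionally the content type, compression type, and input mode.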

    Ref Doc: https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_Channel.html

    Example Config for Channel:

    training_channel = {
        "ChannelName": "train",
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": "s3://your-bucket/train-data/",
                "S3DataDistributionType": "FullyReplicated",
            }
        },
        "ContentType": "text/csv",
        "CompressionType": "None",
        "InputMode": "File"
    }
    

    In short: a channel is a named way to define the source (path) of the input data for a SageMaker training job.
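
To tie the two sides together, here is a minimal sketch of how channel definitions relate to what a training script sees. It is an illustration under stated assumptions, not an official recipe: `make_channel` and `channel_dir` are hypothetical helper names, the `test-data` S3 prefix is a placeholder, and the fallback path assumes the default File-mode layout under /opt/ml/input/data. The list built here has the shape expected by the `InputDataConfig` parameter of a training job request; no AWS call is made.

```python
import os

# Assumed default directory where File-mode channel data appears in the container.
INPUT_DATA_ROOT = "/opt/ml/input/data"


def make_channel(name, s3_uri, content_type="text/csv"):
    """Build one channel dict (hypothetical helper mirroring the config above)."""
    return {
        "ChannelName": name,
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }
        },
        "ContentType": content_type,
        "CompressionType": "None",
        "InputMode": "File",
    }


def channel_dir(name, environ=os.environ):
    """Return the local directory for a channel as the training script sees it.

    Reads SM_CHANNEL_<NAME> if it is set; otherwise falls back to the
    assumed default location under INPUT_DATA_ROOT.
    """
    default = f"{INPUT_DATA_ROOT}/{name}"
    return environ.get(f"SM_CHANNEL_{name.upper()}", default)


# One entry per channel; bucket and prefixes are placeholders.
input_data_config = [
    make_channel("train", "s3://your-bucket/train-data/"),
    make_channel("test", "s3://your-bucket/test-data/"),
]

# Show which local directory each channel name maps to.
for channel in input_data_config:
    name = channel["ChannelName"]
    print(name, "->", channel_dir(name))
```

Inside a training container the loop would print whatever SM_CHANNEL_TRAIN and SM_CHANNEL_TEST contain; run anywhere else, it only shows the assumed default paths. So a channel is both things asked about in the question: a name attached to a data location in the job request, and a directory (or stream, depending on the input mode) that the training code reads from.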