python machine-learning google-cloud-platform gcp-ai-platform-training

Google Cloud Function to run same training job with different user args in parallel


I'm trying to find a way to run the same training job with different user arguments in parallel. I need to run MYMODULE once for each product_X. MYMODULE reads user_arg, but each run should read a different product, and all the runs should be launched in parallel. I know one solution would be to run three separate blocks of training code, but maybe there's a way to do it directly with one.

!gcloud ai-platform jobs submit training demo_training_$(date +"%Y%m%d_%H%M%S") \
    --staging-bucket gs://MYBUCKET \
    --package-path MYPATH \
    --module-name MYMODULE \
    --region MYREGION \
    --runtime-version=2.1 \
    --python-version=3.7 \
    --scale-tier BASIC \
    -- \
    --model-bucket MYBUCKET \
    --output-dir MYDIR \
    --user_arg product_1, product_2, product_3

MYMODULE looks like this, and I need to read a different product on each run.

def train_and_evaluate(args):
    # ... code

    product = args.user_arg

    # ... more code

Solution

  • There isn't a client library for this type of call, but at Google everything is an API, so you can call the API directly to run your jobs.

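    You can call the AI Platform job creation API (projects.jobs.create) to create each job. The request body is a training job definition that contains a training input object (trainingInput), and that object holds the list of args passed to your training job. To run one job per product, send one request per product, each with a different value in args.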

    If you want an example of the JSON to populate, run your gcloud command with the option --log-http and the API call will be printed on the console.


    The API call must be authenticated. You can use the Google OAuth2 library to get credentials, then generate an access token and send it in an Authorization: Bearer header. Also remember to set the Content-Type header to application/json.
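
The steps above could be sketched roughly as follows. This is a minimal sketch, not a tested deployment: it assumes the google-auth package (with requests) is installed, that application default credentials are available, and that the training package has already been uploaded to Cloud Storage, since a direct API call takes packageUris rather than gcloud's local --package-path. The helper names (build_job_body, build_request, get_access_token, submit_jobs) and the package_uri and project_id parameters are hypothetical and only there for illustration.

```python
import json
import urllib.request
from datetime import datetime, timezone

API_ROOT = "https://ml.googleapis.com/v1"


def build_job_body(product, package_uri, module_name, region, model_bucket, output_dir):
    """Build the JSON body for one training job that receives a single product."""
    timestamp = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")
    return {
        # Job IDs must be unique, so the product name is part of the ID.
        "jobId": f"demo_training_{product}_{timestamp}",
        "trainingInput": {
            "scaleTier": "BASIC",
            # Assumption: the package is already in Cloud Storage.
            "packageUris": [package_uri],
            "pythonModule": module_name,
            "region": region,
            "runtimeVersion": "2.1",
            "pythonVersion": "3.7",
            # Everything after the bare "--" in the gcloud command goes here.
            "args": [
                "--model-bucket", model_bucket,
                "--output-dir", output_dir,
                "--user_arg", product,
            ],
        },
    }


def build_request(project_id, body, token):
    """Build the authenticated POST request for the job creation endpoint."""
    return urllib.request.Request(
        url=f"{API_ROOT}/projects/{project_id}/jobs",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def get_access_token():
    """Fetch an OAuth2 access token from application default credentials."""
    # Imported lazily so the helpers above work without google-auth installed.
    import google.auth
    from google.auth.transport.requests import Request

    credentials, _ = google.auth.default(
        scopes=["https://www.googleapis.com/auth/cloud-platform"]
    )
    credentials.refresh(Request())
    return credentials.token


def submit_jobs(project_id, products, **job_kwargs):
    """Submit one training job per product and return the API responses."""
    token = get_access_token()
    responses = []
    for product in products:
        body = build_job_body(product, **job_kwargs)
        with urllib.request.urlopen(build_request(project_id, body, token)) as resp:
            responses.append(json.load(resp))
    return responses
```

A call such as submit_jobs("MYPROJECT", ["product_1", "product_2", "product_3"], package_uri=..., module_name="MYMODULE", region="MYREGION", model_bucket="MYBUCKET", output_dir="MYDIR") would then send three requests. Each create call returns as soon as the job is queued, so a plain loop is enough for the jobs themselves to run in parallel on AI Platform. Comparing the generated body with the output of the gcloud command run with --log-http is a good way to check the field names for your setup.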