Tags: google-cloud-platform, google-cloud-vertex-ai, gcp-ai-platform-notebook

Proper implementation of post-startup script in Vertex AI Instances


When working with Vertex AI notebooks, I wanted to create VMs with persistent Python computing environments. With User-Managed Notebooks, I achieved this with a Docker image. I am now revisiting this with the newer Instances. My Docker image doesn't seem to work with Instances (I will post a separate question), so I resorted to using a post-startup script to install Python modules on the fly during boot.

I was previously successful in implementing a post-startup script following the guidance offered here, using option 2. I am returning to this process with the Vertex AI "Instances", and it is no longer working. Could someone please answer the following?

  1. The gcloud notebooks instances create command can take two post-startup script flags. To my knowledge, the startup-script-url metadata flag executes asynchronously during the boot process and is not actually post-startup. The separate post-startup-script flag is, as I understand it, a true post-startup script, which executes after the instance has been created and booted. With the newer Instances, created using gcloud workbench instances create, the post-startup-script flag has been removed and only startup-script-url remains. Beyond this, I haven't found any documentation mentioning post-startup options. Is there a replacement for the post-startup-script argument?
  2. Related to (1.), and provided my deduction is correct that startup-script-url is not actually run post-startup, is there a way to run a true post-startup script when building an Instance using gcloud workbench instances create? Here, I am interested in running a script as if I were the end user opening up the JupyterLab console. The paired boot-data VMs need to be fully built and accessible (as they would be from the console) before the script runs.
  3. When using a startup-script-url and following the instructions for option 2 here, is the executing user still jupyter?

Solution

    1. startup-script-url is a Compute Engine (GCE) feature: https://cloud.google.com/compute/docs/instances/startup-scripts/linux Because User-Managed Notebooks and Workbench Instances are both built on GCE, both support it automatically.

    The replacement is using metadata:

    --metadata=[KEY=VALUE,...]

    Reference: https://cloud.google.com/vertex-ai/docs/workbench/reference/rest/v2/projects.locations.instances#gcesetup

    In this case, set the metadata key post-startup-script to the location of your script (a GCS or HTTPS path). Example: gs://test-bucket/PostStartupScript.sh
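As a rough sketch of the metadata approach described above, the command below passes post-startup-script through --metadata when creating a Workbench Instance. The instance name and zone are hypothetical placeholders, and the bucket path is the example from the answer. The script only prints the command unless RUN=1 is set, so it can be inspected before anything is created.

```shell
#!/bin/sh
# Sketch only: builds a "gcloud workbench instances create" call that passes
# the post-startup script via instance metadata.
set -eu

INSTANCE_NAME="my-workbench-instance"               # hypothetical name
LOCATION="us-central1-a"                            # hypothetical zone
SCRIPT_URI="gs://test-bucket/PostStartupScript.sh"  # example path from the answer

CREATE_CMD="gcloud workbench instances create $INSTANCE_NAME --location=$LOCATION --metadata=post-startup-script=$SCRIPT_URI"

# Show the command for review.
echo "$CREATE_CMD"

# Set RUN=1 to actually invoke gcloud.
# CREATE_CMD is left unquoted on purpose: none of the values contain spaces.
if [ "${RUN:-0}" = "1" ]; then
  $CREATE_CMD
fi
```

Additional flags (machine type, network, and so on) would be appended to the same command as needed; they are omitted here to keep the focus on the metadata key.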

    2. Use the same post-startup-script metadata key, with the location of your script (a GCS or HTTPS path) as the value. Example: gs://test-bucket/PostStartupScript.sh

    3. No, it runs as root.

    "the startup script runs as root" https://cloud.google.com/compute/docs/instances/startup-scripts/linux

    The Vertex AI post-startup-script also runs as root.

    Documentation here: https://cloud.google.com/vertex-ai/docs/workbench/instances/manage-metadata
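Since the script runs as root, one possible way to perform user-level installs is to switch to the jupyter user inside the script. The sketch below is an assumption, not documented Workbench behaviour: it presumes a jupyter user exists on the instance (as the question implies), that sudo and python3 are available on that user's PATH, and it uses hypothetical package names. It prints the command unless RUN=1 is set.

```shell
#!/bin/sh
# Hypothetical post-startup script: installs Python packages for the
# jupyter user even though the script itself is started as root.
set -eu

TARGET_USER="jupyter"    # user named in the question; assumed to exist on the instance
PACKAGES="numpy pandas"  # hypothetical package list

# Print the prefix needed to run a command as TARGET_USER.
# When already unprivileged (for example in a local dry run), no prefix is needed.
run_prefix() {
  if [ "$(id -u)" -eq 0 ]; then
    echo "sudo -H -u $TARGET_USER"
  else
    echo ""
  fi
}

INSTALL_CMD="$(run_prefix) python3 -m pip install --user $PACKAGES"

# Show the command for review.
echo "$INSTALL_CMD"

# Set RUN=1 to actually perform the installation.
# INSTALL_CMD is left unquoted on purpose so it splits into separate arguments.
if [ "${RUN:-0}" = "1" ]; then
  $INSTALL_CMD
fi
```

The -H flag sets HOME to the target user's home directory, so that a --user install lands under that user rather than under root. The interpreter that JupyterLab actually uses on the instance may differ from the python3 found on PATH, so the path is worth verifying before relying on this.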