I'm trying to configure trino-jvm properties while creating a Dataproc cluster. I'm following Google's documentation and am able to successfully create a cluster without any special JVM configuration, but am receiving an error when attempting to configure JVM properties.
Here's the CLI command that I'm running:
gcloud dataproc clusters create test-dataproc-cluster \
--project=MY_PROJECT \
--optional-components=TRINO \
--enable-component-gateway \
--region=us-central1 \
--image-version=2.1 \
--properties="trino-jvm:XX:+HeapDumpOnOutOfMemoryError"
Here's the error that I receive:
ERROR: (gcloud.dataproc.clusters.create) argument --properties: Bad syntax for dict arg: [trino-jvm:XX:+HeapDumpOnOutOfMemoryError]. Please see `gcloud topic flags-file` or `gcloud topic escaping` for information on providing list or dictionary flag values with special characters.
It looks like Dataproc expects the value passed to the --properties argument to be in dictionary form, i.e. --properties=TYPE:KEY=VALUE. I'm able to successfully configure other properties that have a Key/Value syntax. However, I'm unable to configure JVM properties that do not follow that Key/Value form.
How can I configure trino-jvm properties in Dataproc?
You can use the --properties flag to specify the Trino JVM property trino.jvm-extras=-XX:+HeapDumpOnOutOfMemoryError. The property is provided in key=value format:
--properties trino-env-config=trino.jvm-extras=-XX:+HeapDumpOnOutOfMemoryError
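If you want to catch this kind of syntax error before calling gcloud, a small shell check can help. The sketch below is only a rough approximation of the key=value requirement implied by the error message, not gcloud's actual parser, and the function name has_kv_form is a hypothetical helper introduced here for illustration.

```shell
# Hypothetical helper (not part of gcloud): succeeds only if every
# comma-separated entry of a --properties value looks like KEY=VALUE.
has_kv_form() {
  old_ifs=$IFS
  IFS=,
  for entry in $1; do
    case $entry in
      ?*=*) ;;                          # non-empty key followed by '='
      *) IFS=$old_ifs; return 1 ;;      # no '=' in this entry: reject
    esac
  done
  IFS=$old_ifs
  return 0
}

# The value from the question has no '=', so it is rejected.
if has_kv_form "trino-jvm:XX:+HeapDumpOnOutOfMemoryError"; then
  echo "question value: looks like key=value"
else
  echo "question value: missing '='"
fi

# The value suggested above contains '=', so it passes this check.
if has_kv_form "trino-env-config=trino.jvm-extras=-XX:+HeapDumpOnOutOfMemoryError"; then
  echo "suggested value: looks like key=value"
else
  echo "suggested value: missing '='"
fi
```

Passing this check only means the value has the shape gcloud expects for a dictionary argument; it does not confirm that Dataproc recognizes the property name itself.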
Aside from the --properties flag, one workaround for the error is to use the --metadata flag, because it accepts multiple key-value pairs. It can provide the necessary JVM properties for Trino in Dataproc clusters during cluster creation:
--metadata trino-jvm=XX:+HeapDumpOnOutOfMemoryError