Tags: google-cloud-platform, ssl-certificate, google-cloud-dataproc, google-cloud-dataproc-serverless

Dataproc Serverless - how to set javax.net.ssl.trustStore property to fix java.security.cert.CertPathValidatorException


I am trying to use Dataproc Serverless with the spark.jars.repositories option:

gcloud beta dataproc batches submit pyspark sample.py --project=$GCP_PROJECT --region=$MY_REGION --properties \
spark.jars.repositories='https://my.repo.com:443/artifactory/my-maven-prod-group',\
spark.jars.packages='com.spark.mypackage:my-module-jar',spark.dataproc.driverEnv.javax.net.ssl.trustStore=.,\
spark.driver.extraJavaOptions='-Djavax.net.ssl.trustStore=. -Djavax.net.debug=true' \
--files=my-ca-bundle.crt

The batch fails with this exception:

 javax.net.ssl.SSLHandshakeException: java.security.cert.CertPathValidatorException

I tried to set the javax.net.ssl.trustStore property through both spark.dataproc.driverEnv and spark.driver.extraJavaOptions, but it's not working.

Is it possible to fix this by setting the right config properties and values, or is a custom image with pre-installed certificates the ONLY solution?


Solution

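  • You need a Java trust store with your certificate imported. The javax.net.ssl.trustStore property must point to a keystore file (for example a .jks file), not to a directory such as "." or to a PEM .crt bundle. Then submit the batch with the trust store attached and referenced from both the driver and the executors: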

    --files=my-trust-store.jks \
    --properties spark.driver.extraJavaOptions='-Djavax.net.ssl.trustStore=./my-trust-store.jks',spark.executor.extraJavaOptions='-Djavax.net.ssl.trustStore=./my-trust-store.jks'