airflow, google-cloud-dataproc, dataproc

How to enable the Spark web interface on Dataproc (GCP) using Apache Airflow's DataprocCreateClusterOperator


We are using Apache Airflow's DataprocCreateClusterOperator to create a Spark cluster on GCP (Dataproc) and want to enable the Spark web UI. When creating the cluster from the terminal, we pass --enable-component-gateway in the create cluster command. How can we achieve the same thing with DataprocCreateClusterOperator?

We tried adding endpoint_config inside software_config (sample code below), but no luck.

"software_config" : {
    .....,
    "endpoint_config" : {
        "enable_http_port_access" : "true"
    }
}

Solution

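  • According to the Dataproc REST API (Cluster, ClusterConfig, EndpointConfig), endpointConfig belongs directly under config, as a sibling of softwareConfig rather than nested inside it, and enableHttpPortAccess is a boolean, not the string "true". It should be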

    {
      "clusterName": ...,
      ...
      "config": {
        "endpointConfig": {
          "enableHttpPortAccess" : true
        },
        "softwareConfig": {
          ...
        },
        ...
      }
    }
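
For the Airflow side, here is a minimal sketch of how that layout might translate to the dict passed as cluster_config. It assumes the operator accepts the snake_case field names (endpoint_config, enable_http_port_access), as in the question's snippet. The helper names build_cluster_config and make_create_cluster_task, the task_id, and the image version are placeholders for illustration, not part of the original setup.

```python
def build_cluster_config(software_config=None):
    """Return a cluster config dict with the Component Gateway enabled."""
    return {
        # endpoint_config sits next to software_config, not inside it,
        # and the flag is a real boolean rather than the string "true".
        "endpoint_config": {"enable_http_port_access": True},
        "software_config": dict(software_config or {}),
    }


def make_create_cluster_task(project_id, region, cluster_name):
    """Build the Airflow task; the arguments are supplied by the caller."""
    # Imported lazily so the dict builder above works without Airflow installed.
    from airflow.providers.google.cloud.operators.dataproc import (
        DataprocCreateClusterOperator,
    )

    return DataprocCreateClusterOperator(
        task_id="create_cluster",  # placeholder task id
        project_id=project_id,
        region=region,
        cluster_name=cluster_name,
        # "2.1-debian11" is only an example image version
        cluster_config=build_cluster_config({"image_version": "2.1-debian11"}),
    )


# Example config dict, built without touching Airflow
CLUSTER_CONFIG = build_cluster_config({"image_version": "2.1-debian11"})
```

With this shape, the web interface links should show up for the cluster in the same way as when it is created with --enable-component-gateway.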