Tags: java, google-cloud-platform, apache-beam, permission-denied, google-secret-manager

Dataflow Runner gets PERMISSION_DENIED from GCP Secret Manager


I have an Apache Beam Dataflow project written in Java, where I use the following subroutine to fetch database credentials:

import com.google.cloud.secretmanager.v1.SecretManagerServiceClient;
import com.google.cloud.secretmanager.v1.SecretPayload;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;

private static JsonObject getCredentials(String suffix) {
    String secretName = "projects/example-project/secrets/example_" + suffix + "/versions/latest";
    // The client authenticates with whatever Application Default Credentials
    // exist in the environment it runs in.
    try (SecretManagerServiceClient client = SecretManagerServiceClient.create()) {
        SecretPayload payload = client.accessSecretVersion(secretName).getPayload();
        String secretText = payload.getData().toStringUtf8();
        return JsonParser.parseString(secretText).getAsJsonObject();
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}

This works on my dev machine, but when I compile and run it on GCP with the Dataflow Runner, I get the following error:

Error message from worker: java.lang.RuntimeException: org.apache.beam.sdk.util.UserCodeException: java.lang.RuntimeException: com.google.api.gax.rpc.PermissionDeniedException: io.grpc.StatusRuntimeException: PERMISSION_DENIED: Permission 'secretmanager.versions.access' denied for resource 'projects/example-project/secrets/example_server1/versions/latest' (or it may not exist).

Since the code works on my dev machine, I know the secret exists. I use the same service account credentials both on my dev machine and when triggering the Dataflow job. Just for good measure, I even made that service account a Secret Manager Admin in IAM.

Any ideas on how I might narrow down the issue and get my Dataflow job working?


Solution

  • I had created a service account to give some Python scripts and my Java Apache Beam / Dataflow code access to BigQuery and other GCP resources, and I had set that service account's credentials JSON as my dev machine's Application Default Credentials.

    I assumed the Secret Manager permission had to be granted to this service account. However, while the Dataflow job was triggered by this service account, the workers that actually ran the job used a different one.

    I filtered my service accounts and found the 'Default compute service account'. Granting the Secret Manager Secret Accessor role to that account fixed the issue, and my job now runs fine.
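
    As an alternative to widening the Compute Engine default service account's permissions, Dataflow can run the workers as a dedicated service account via the serviceAccount pipeline option (setServiceAccount on DataflowPipelineOptions). A minimal sketch, assuming a hypothetical dataflow-worker@example-project.iam.gserviceaccount.com account that has already been granted roles/secretmanager.secretAccessor:

    import org.apache.beam.runners.dataflow.options.DataflowPipelineOptions;
    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.options.PipelineOptionsFactory;

    public class SecretAwarePipeline {
        public static void main(String[] args) {
            DataflowPipelineOptions options = PipelineOptionsFactory
                    .fromArgs(args)
                    .withValidation()
                    .as(DataflowPipelineOptions.class);
            // Workers (and therefore the Secret Manager calls) run as this
            // identity instead of the Compute Engine default service account.
            // Hypothetical account name; substitute your own.
            options.setServiceAccount(
                    "dataflow-worker@example-project.iam.gserviceaccount.com");
            Pipeline pipeline = Pipeline.create(options);
            // ... build the pipeline as usual ...
            pipeline.run();
        }
    }

    The same option can be passed on the command line as --serviceAccount=... when launching the job. Either way, the point is that the worker identity, not the identity that triggers the job, is the one that needs secretmanager.versions.access.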