google-cloud-platform google-cloud-data-fusion google-cloud-dlp

Using GCP DLP with DataFusion, unable to find template


I have created a DLP identification template named DLPTest in Project X.
My Data Fusion resources are hosted in Project Y.
The issue arises when I use the Redact plugin in Data Fusion and provide the template ID or path in one of these forms:
projects/X/locations/{LOCATION}/inspectTemplates/DLPTest or
projects/X/inspectTemplates/DLPTest
All permissions have been granted to the Data Fusion service account, the Compute Engine service account, and the DLP service account. Data Fusion still fails to find the template, because it keeps searching for it in Project Y.
Error logs:
> Caused by: com.google.api.gax.rpc.InvalidArgumentException: io.grpc.StatusRuntimeException: INVALID_ARGUMENT: Invalid path:
Data Fusion is expecting the template at projects/Y/inspectTemplates/projects/DLPTest
How do I make Data Fusion look for the template in the correct location in a separate project? Thanks.

Solution

  • To let Project Y (where your Data Fusion instance lives) use resources from Project X (where the DLP template lives), add the Data Fusion and Compute Engine service accounts of Project Y to Project X.

    Notes:

    Project Y:

    1. Go to IAM & Admin -> IAM
    2. Click View by: "Members"
    3. Tick checkbox "Include Google-provided role grants"
    4. Look for service-(project number of Project Y)@gcp-sa-datafusion.iam.gserviceaccount.com and (project number of Project Y)-compute@developer.gserviceaccount.com
    5. Add role "DLP Administrator" for service-(project number of Project Y)@gcp-sa-datafusion.iam.gserviceaccount.com

    Project X:

    1. Go to IAM & Admin -> IAM
    2. Click Add
    3. Under New Members, put service-(project number of Project Y)@gcp-sa-datafusion.iam.gserviceaccount.com
    4. Grant the role "DLP Administrator"
    5. Repeat steps 2 to 4, but this time enter (project number of Project Y)-compute@developer.gserviceaccount.com
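
The console steps for Project X can also be expressed on the command line. The following is a minimal sketch, not a verified procedure: it only builds the two service account addresses from Project Y's project number and prints the matching `gcloud projects add-iam-policy-binding` commands. The role ID `roles/dlp.admin` is assumed to correspond to "DLP Administrator", and the project ID and number in the usage example are placeholders.

```python
from typing import List

# Assumption: "DLP Administrator" in the console maps to this role ID.
DLP_ADMIN_ROLE = "roles/dlp.admin"


def datafusion_service_agent(project_number: str) -> str:
    """Data Fusion service account of the project with this number."""
    return f"service-{project_number}@gcp-sa-datafusion.iam.gserviceaccount.com"


def compute_default_service_account(project_number: str) -> str:
    """Compute Engine default service account of the project with this number."""
    return f"{project_number}-compute@developer.gserviceaccount.com"


def grant_commands(dlp_project_id: str, fusion_project_number: str) -> List[str]:
    """Build gcloud commands that grant both accounts the DLP role in the DLP project."""
    accounts = [
        datafusion_service_agent(fusion_project_number),
        compute_default_service_account(fusion_project_number),
    ]
    return [
        "gcloud projects add-iam-policy-binding "
        f"{dlp_project_id} "
        f"--member=serviceAccount:{account} "
        f"--role={DLP_ADMIN_ROLE}"
        for account in accounts
    ]


if __name__ == "__main__":
    # Placeholder values: "X" is the DLP project ID, the number belongs to Project Y.
    for command in grant_commands("X", "123456789012"):
        print(command)
```

Review the printed commands before running them; they grant a broad role, exactly as the console steps do.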

    Now that the permissions are set, go back to Project Y and update your Redact plugin to point to Project X.

    1. Go to Data Fusion -> Studio
    2. Click Redact -> Properties
    3. Enter the ID of the template you created in Project X; in my sample it is "test_template"
    4. Under Project ID, enter the Project ID of Project X
    5. Run your Data Fusion pipeline
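
The Redact settings above take a bare template ID and a separate Project ID, rather than a full resource path. As a rough illustration, the sketch below splits a full inspect-template name, in either of the two forms from the question, into those two values. The helper name `split_template_name` is hypothetical, and the idea that the plugin assembles the final path from the two fields is an inference from the error message in the question, not confirmed plugin behavior.

```python
from typing import Tuple


def split_template_name(name: str) -> Tuple[str, str]:
    """Split a full inspect-template name into (project_id, template_id).

    Accepts both forms seen in the question:
      projects/<project>/inspectTemplates/<id>
      projects/<project>/locations/<location>/inspectTemplates/<id>
    """
    parts = name.split("/")
    if len(parts) == 4 and parts[0] == "projects" and parts[2] == "inspectTemplates":
        return parts[1], parts[3]
    if (
        len(parts) == 6
        and parts[0] == "projects"
        and parts[2] == "locations"
        and parts[4] == "inspectTemplates"
    ):
        return parts[1], parts[5]
    raise ValueError(f"not a recognised inspect template name: {name!r}")


if __name__ == "__main__":
    project_id, template_id = split_template_name("projects/X/inspectTemplates/DLPTest")
    # These are the two values to enter in the Redact properties.
    print("Project ID:", project_id)
    print("Template ID:", template_id)
```

With the names from the question, this yields Project ID `X` and template ID `DLPTest`, which go into the two separate Redact fields.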