I'm able to run inspection against text content without any issues, but am never able to get any positive findings against images. I'm running code that's almost exactly like the example code here: https://cloud.google.com/dlp/docs/inspecting-images#dlp-inspect-file-ruby.
My code:
    parent = "projects/xxxxxxxx/locations/global"
    inspect_config = {
      info_types: [{ name: "CREDIT_CARD_NUMBER" }],
      min_likelihood: :POSSIBLE,
      limits: { max_findings_per_request: 0 },
      include_quote: true
    }
    dlp = Google::Cloud::Dlp.dlp_service
    dlp.inspect_content(
      parent: parent,
      inspect_config: inspect_config,
      item: {
        byte_item: {
          type: :BYTES_TYPE_UNSPECIFIED,
          data: File.open("credit_card.png", "rb").read
        }
      }
    )
I consistently get 0 findings in my results:
<Google::Cloud::Dlp::V2::InspectContentResponse: result: <Google::Cloud::Dlp::V2::InspectResult: findings: [], findings_truncated: false>>
I've tried multiple images of credit cards, including one I pulled from a blog post where the author had a positive finding from it (example #2).
I've also tried the EMAIL_ADDRESS InfoType with pictures that contain email addresses, with the same results. Some of the docs mention sending a Base64-encoded version of the image, so I also tried wrapping the data in Base64.encode64(File.open(abs_filename, 'rb').read), with no luck either. Any help or ideas would be appreciated!
The credit card examples you gave fail checksum checks (I tried typing them in at https://cloud.google.com/dlp/demo/#!/ and confirmed they don't match). Did the original test work on a real card?
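To check a test number locally before putting it in an image, here is a minimal sketch of a checksum test. It assumes the checksum in question is the standard Luhn check used for card numbers; `luhn_valid?` is a hypothetical helper name, not part of the DLP client.

```ruby
# Returns true when the digits in `number` pass the Luhn checksum.
# Non-digit characters such as spaces and dashes are ignored.
def luhn_valid?(number)
  digits = number.to_s.gsub(/\D/, "").chars.map(&:to_i)
  return false if digits.empty?

  # Walking from the rightmost digit, double every second digit and
  # subtract 9 whenever the doubled value exceeds 9.
  sum = digits.reverse.each_with_index.sum do |digit, index|
    if index.odd?
      doubled = digit * 2
      doubled > 9 ? doubled - 9 : doubled
    else
      digit
    end
  end

  (sum % 10).zero?
end
```

For example, `luhn_valid?("4111 1111 1111 1111")` is true for that widely used test number, while changing its last digit to 2 makes the check fail.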
To verify that the image is making its way through and that this isn't an inspection quality bug, can you try sending a RedactImageRequest instead, with redact_all_text set to true? If we see that the OCR step is working, we can narrow down which part is having the problem.
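A minimal sketch of that request with the Ruby client, assuming the same `google-cloud-dlp` gem as in the question and a PNG input. The helper names, the explicit `:IMAGE_PNG` type, and the output path are illustrative assumptions, not something taken from the docs example.

```ruby
# Builds a RedactImage request that masks every piece of text OCR detects.
# No info types are involved, so this exercises only the OCR step.
def build_redact_all_text_request(project_id, image_path)
  {
    parent: "projects/#{project_id}/locations/global",
    image_redaction_configs: [{ redact_all_text: true }],
    byte_item: {
      type: :IMAGE_PNG, # assumption: the input file is a PNG
      data: File.binread(image_path)
    }
  }
end

# Sends the request and writes the redacted image to output_path.
# Requires the google-cloud-dlp gem and valid credentials.
def redact_all_text(project_id, image_path, output_path)
  require "google/cloud/dlp"

  dlp = Google::Cloud::Dlp.dlp_service
  response = dlp.redact_image(build_redact_all_text_request(project_id, image_path))
  File.binwrite(output_path, response.redacted_image)
  response
end

# Example call (hypothetical paths):
#   redact_all_text("xxxxxxxx", "credit_card.png", "credit_card_redacted.png")
```

If the text in the written image comes back covered, OCR is reading the image and the problem is more likely in the info type matching. If nothing is covered, the image content is probably not being read at all.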
If that part is working, the result should look like this: