
Using Google (Beta) Web Risk API


I am trying to use the Google Web Risk API (beta) from my Python code. Here is the sample code:

 import requests

 URI = 'http://www.amazongroupco.org'  # bad url
 key = 'key=<mykey>'
 threat = '&threatTypes=MALWARE'
 queryurl = 'https://webrisk.googleapis.com/v1beta1/uris:search?'
 requeststring = queryurl + key + threat
 header = {"Content-Type": "application/json"}
 payload = {'uri': URI}  # requests URL-encodes values passed via params

 try:
     req = requests.get(requeststring, headers=header, params=payload)
     print(req.url)

     if req.status_code == 200:
         print(req, req.json())  # status line plus the parsed JSON body
     else:
         print(" ERROR:", req, req.text)

 except requests.RequestException as e:
     # req may not exist if the request itself failed, so do not reference it here
     print(" Google API returned error:", e)

The above code always returns a successful status, "Response [200] OK", with an empty JSON response {}. Since this is a malicious site, I was expecting something in the JSON response. I tried other malicious sites as well, but I get the same result: an empty JSON object with status 200 OK.
Am I missing something? I understand that some sites may not host malware but are social engineering sites, which is a different threat type. So I am wondering whether there is a general-purpose, catch-all threatTypes value that returns a JSON object for any kind of threat, as long as the site is a threat. As a side note, anyone trying this needs a GCP account to generate a key. Any guidance will be much appreciated.


Solution

  • I have checked the Web Risk API and it works. I also reproduced your issue and got the same result. Google does not consider the URL you are checking to be a MALWARE threat. I tried the other threat types for that specific URL as well, and it does not appear to be in any of Google's lists.

    The API reference lists all the threat types you can use. There is a type that sounds like what you describe, THREAT_TYPE_UNSPECIFIED, but it always returns an "invalid argument" error JSON, and this is intended behaviour.

    I should also note that, as stated in the official documentation, you should URL-encode the URI when using the REST API:

    The URL must be valid (see RFC 2396) but it doesn't need to be canonicalized.

    If you use the REST API, you must encode GET parameters, like the URI.
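
Since no catch-all threat type works, one option is to request several concrete threat types in a single call and URL-encode the URI at the same time. The sketch below is only an illustration: the helper names `build_search_url` and `matched_threat_types` are hypothetical, it assumes the v1beta1 endpoint accepts a repeated `threatTypes` parameter with the values `MALWARE`, `SOCIAL_ENGINEERING` and `UNWANTED_SOFTWARE`, and it assumes that a match is returned under a `threat` key while no match gives `{}`, as seen in the question.

```python
from urllib.parse import urlencode

BASE_URL = "https://webrisk.googleapis.com/v1beta1/uris:search"

# Assumption: the concrete threat types accepted by the v1beta1 API.
# THREAT_TYPE_UNSPECIFIED is left out because it always yields an error.
THREAT_TYPES = ("MALWARE", "SOCIAL_ENGINEERING", "UNWANTED_SOFTWARE")


def build_search_url(uri, api_key, threat_types=THREAT_TYPES):
    """Build a uris:search URL with every GET parameter percent-encoded."""
    params = [("key", api_key), ("uri", uri)]
    # Repeat the threatTypes parameter once per requested type.
    params += [("threatTypes", t) for t in threat_types]
    return BASE_URL + "?" + urlencode(params)


def matched_threat_types(response_json):
    """Return the matched threat types, or [] for an empty {} response."""
    # Assumption: a match looks like {"threat": {"threatTypes": [...], ...}}.
    return response_json.get("threat", {}).get("threatTypes", [])


if __name__ == "__main__":
    print(build_search_url("http://www.amazongroupco.org", "<mykey>"))
```

The resulting URL can be passed directly to `requests.get(url)`, and `matched_threat_types(req.json())` then distinguishes "no match" (an empty list) from a hit. An empty list only means the URL is not on the requested lists, not that the site is safe.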