I have been working on a project, and part of it involves receiving a link from the user. Currently, I am using a very rudimentary mechanism to verify the user input (all it does is check whether the input is a URL or not). In an attempt to improve security, I have been googling around for solutions. Some of the solutions I came across involved checking the characters in the link, checking the link's length, etc. I also came across some websites that maintain auto-updating databases of known malicious sites. However, going through and comparing thousands of links would be very inefficient, and the first solution felt a bit spotty.
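(For reference, the "is this even a URL" step can stay cheap and separate from any reputation check. A minimal sketch using only the standard library; the function name `looks_like_url` is my own, not from the project:)

```python
from urllib.parse import urlparse

def looks_like_url(candidate: str) -> bool:
    """Cheap syntactic check only -- says nothing about whether the URL is malicious."""
    parsed = urlparse(candidate)
    # Require an http(s) scheme and a non-empty host part
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)
```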
After a little more searching, I came across Google's Lookup API:
https://cloud.google.com/web-risk/docs/lookup-api#python
The gist of it is that it compares inputted URLs with various Web Risk lists.
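(For context, a lookup is a single GET against the `uris:search` endpoint described in the linked docs. This is a hedged sketch, not a tested integration: the function names are my own, the API key is a placeholder, and the threat-type list is just an example.)

```python
def web_risk_match(response_json: dict) -> bool:
    """An empty JSON object from uris:search means no list matched."""
    return "threat" in response_json

def check_with_web_risk(api_key: str, url: str) -> bool:
    """Query the Web Risk Lookup API; True if the URL appears on a queried list."""
    import requests  # imported lazily so web_risk_match stays usable without the dependency

    response = requests.get(
        "https://webrisk.googleapis.com/v1/uris:search",
        params={
            "key": api_key,  # placeholder: your own Google Cloud API key
            "uri": url,
            # threatTypes may be repeated; requests encodes a list as repeated params
            "threatTypes": ["MALWARE", "SOCIAL_ENGINEERING", "UNWANTED_SOFTWARE"],
        },
        timeout=10,
    )
    response.raise_for_status()
    return web_risk_match(response.json())
```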
But, I also came across the following thread:
https://www.reddit.com/r/cybersecurity/comments/192llqy/automating_the_detection_of_malicious_urls/
The thread suggests using a paid intelligence service over anything free or self-made.
So, my questions are as follows:
Is there a reason to use the paid services over Google's? (I realize that Google limits the number of requests, but for now assume that I am not dealing with a huge volume of requests.)
Can I make something myself that checks URLs against updated databases in a quick and easy way? (The impression I got from the thread above was that this is not possible.)
What services should I use if I opt against Google? (I saw some listed in the thread above; however, those services tend to check sites that already have a fair bit of traffic. My project deals with some users whose sites have little to no traffic, which could result in a false malicious tag.)
The thread posted above mirrors my situation quite closely, in that I have no cybersecurity experience and I too am accessing a lot of the sites' metadata.
Any advice would be much appreciated!
When checking whether URLs are safe in our project, the first decision is between relying on existing reputation services and building our own solution.
Example of How to Use Google Safe Browsing and VirusTotal
import base64
import requests

# Check a URL against the Google Safe Browsing v4 Lookup API
def check_with_google_safe_browsing(api_key, url):
    api_url = f'https://safebrowsing.googleapis.com/v4/threatMatches:find?key={api_key}'
    body = {
        'client': {
            'clientId': 'ourcompany',
            'clientVersion': '1.0'
        },
        'threatInfo': {
            'threatTypes': ['MALWARE', 'SOCIAL_ENGINEERING'],
            'platformTypes': ['ANY_PLATFORM'],  # don't restrict matches to Windows-only threats
            'threatEntryTypes': ['URL'],
            'threatEntries': [{'url': url}]
        }
    }
    response = requests.post(api_url, json=body, timeout=10)
    response.raise_for_status()
    return response.json()

# Check a URL against the VirusTotal v3 API. The URL must be addressed by its
# unpadded URL-safe base64 identifier, not passed as a query parameter.
def check_with_virustotal(api_key, url):
    url_id = base64.urlsafe_b64encode(url.encode()).decode().rstrip('=')
    response = requests.get(
        f'https://www.virustotal.com/api/v3/urls/{url_id}',
        headers={'x-apikey': api_key},
        timeout=10
    )
    response.raise_for_status()
    return response.json()

# Main function to check a URL against both services
def check_url_safety(url):
    google_api_key = 'OUR_GOOGLE_SAFE_BROWSING_API_KEY'
    virustotal_api_key = 'OUR_VIRUSTOTAL_API_KEY'

    # Check with Google Safe Browsing: any entry under 'matches' means a hit
    google_result = check_with_google_safe_browsing(google_api_key, url)
    if google_result.get('matches'):
        print("URL detected as malicious by Google Safe Browsing")
        return False

    # Check with VirusTotal: use defaults at each step so a missing key cannot crash
    virustotal_result = check_with_virustotal(virustotal_api_key, url)
    stats = virustotal_result.get('data', {}).get('attributes', {}).get('last_analysis_stats', {})
    if stats.get('malicious', 0) > 0:
        print("URL detected as malicious by VirusTotal")
        return False

    print("URL is safe")
    return True

# Example usage
url_to_check = 'http://example.com'
is_safe = check_url_safety(url_to_check)
By using a mix of services, we can make our URL verification system stronger. First, use Google Safe Browsing for the basics, then add VirusTotal or other services for a more thorough look. Paid options can give us even better protection, but make sure the cost is worth it for what the project needs.