I am trying to find out details about suspicious traffic on my website, which is running on Google Cloud (Google App Engine with Java, to be more specific). One idea is to analyze which IP addresses send requests most often. In SQL I would do something like
SELECT
protoPayload.ip,
COUNT(protoPayload.ip) AS `ip_occurrence`
FROM
foo /* TODO replace foo with correct table name */
WHERE
protoPayload.ip NOT LIKE '66.249.77.%' /* ignore Google bots */
GROUP BY
protoPayload.ip
ORDER BY
`ip_occurrence` DESC
LIMIT 100
But I have no idea how to do this in the Logs Explorer. “Log Analytics” seems to allow such SQL, but apparently it is meant to be used only on non-production projects.
I also tried to download the logs from the Logs Explorer, but there is a limit of 10,000 log entries, which is not nearly enough.
Is there any easy way?
As for the bigger picture: I am trying to get my AdSense account reopened. So far I have failed. Maybe the proof I provided, my Google Analytics data, is not strong enough. The field description on the form mentions IP addresses, but in Google Analytics I don't see any IP addresses ...
The Logs Explorer lets you write simple queries for filtering, but it has no GROUP BY capability.
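For reference, a Logs Explorer filter that mirrors the WHERE clause in your SQL might look like this (a sketch in the Logging query language; it only filters entries, the counting still has to happen elsewhere):

```
resource.type="gae_app"
protoPayload.ip !~ "^66\.249\.77\."
```

Here `!~` is the negated regular-expression match, which plays the role of `NOT LIKE '66.249.77.%'`.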
To achieve something similar you can use a sink:
Sinks control how Cloud Logging routes logs. Using sinks, you can route some or all of your logs to supported destinations. Some of the reasons that you might want to control how your logs are routed include the following:
- To store logs that are unlikely to be read but that must be retained for compliance purposes.
- To organize your logs in buckets in a format that is useful to you.
- To use big-data analysis tools on your logs.
- To stream your logs to other applications, other repositories, or third parties.
- Cloud Storage: JSON files stored in Cloud Storage buckets.
- Pub/Sub: JSON messages delivered to Pub/Sub topics. Supports third-party integrations, such as Splunk, with Logging.
- BigQuery: Tables created in BigQuery datasets.
- Another Cloud Logging bucket: Log entries held in Cloud Logging log buckets.
For your scenario, a BigQuery sink would be the best fit.
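Once the logs are routed to BigQuery, your original query works almost unchanged; only the table name needs to be filled in. A sketch, assuming a project `my_project` and a dataset `my_logs` (both placeholders; the actual table names are generated by Cloud Logging, so check your dataset for the exact name):

```sql
SELECT
  protoPayload.ip,
  COUNT(protoPayload.ip) AS ip_occurrence
FROM
  -- table name is generated by the sink; check your dataset for the exact name
  `my_project.my_logs.appengine_googleapis_com_request_log_20240101`
WHERE
  protoPayload.ip NOT LIKE '66.249.77.%' /* ignore Google bots */
GROUP BY
  protoPayload.ip
ORDER BY
  ip_occurrence DESC
LIMIT 100
```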
The documentation has a step-by-step guide on how to create a sink.
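A sink can also be created from the command line; a sketch, again assuming the placeholder names `my_project` and `my_logs` and filtering for App Engine logs:

```
gcloud logging sinks create my-bigquery-sink \
  bigquery.googleapis.com/projects/my_project/datasets/my_logs \
  --log-filter='resource.type="gae_app"'
```

Note that after creating the sink you still have to grant its writer identity permission to write to the BigQuery dataset (e.g. the BigQuery Data Editor role), otherwise no tables will appear.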
Useful links: