google-cloud-dataflow apache-beam spotify-scio

Strange Google Dataflow job log entries


Recently, the job logs in my job details view have been full of entries such as:

  "Worker configuration: [machine-type] in [zone]."

The jobs themselves seem to work fine, but these entries didn't show up before, and I am worried they will keep me from spotting meaningful log entries.

Is it something I should be worried about? Do you know how to get rid of them?


Solution

  • Those logs are spammy, but they are nothing to worry about. I have submitted an internal bug to reduce this kind of log spam, starting with this entry. While it is being fixed, you can use the Stackdriver Logs Exclusion feature, which lets you create filters that exclude logs matching a user-defined query.

    Here are the steps to exclude specific Dataflow logs:

    1. Navigate to the logs ingestion page
    2. Find the "Dataflow Step" row
    3. Click the right-most button on the same row
    4. Select the "Create Exclusion Filter..." option from the drop-down
    5. Write the query to select which logs you want to exclude (in your case: resource.type="dataflow_step" "Worker configuration")
    6. Name your filter
    7. Select the percentage of matching logs to exclude (100% is the default)
    8. Click the "Create Exclusion" button
    9. You can view your created exclusion filter in the "Exclusions" tab in the logs ingestion page
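
If you would rather keep the exclusion in a script than click through the console, the filter from step 5 can be assembled programmatically. The following is only a minimal sketch: `build_exclusion` is a hypothetical helper, not part of any Google library, and the exclusion name is made up. It assumes the exclusion is described by the usual `name` / `description` / `filter` / `disabled` fields, and that a partial exclusion (step 7) is expressed with the query language's `sample(insertId, ...)` function. It writes the `AND` explicitly, which should be equivalent to the space-separated query in step 5. It only builds and prints the definition; it does not call any API.

```python
import json


def build_exclusion(name, text, resource_type="dataflow_step", fraction=1.0):
    """Build a dict describing a log exclusion (hypothetical helper).

    name          -- identifier for the exclusion (step 6)
    text          -- substring that the excluded log entries contain
    resource_type -- monitored resource type to restrict the filter to
    fraction      -- share of matching entries to exclude, in (0, 1] (step 7)
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in the range (0, 1]")

    # Escape backslashes and double quotes so the text stays one quoted term.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    log_filter = f'resource.type="{resource_type}" AND "{escaped}"'

    # Assumption: partial exclusion is expressed by sampling on insertId.
    if fraction < 1.0:
        log_filter += f" AND sample(insertId, {fraction})"

    return {
        "name": name,
        "description": f"Exclude '{text}' entries from {resource_type}",
        "filter": log_filter,
        "disabled": False,
    }


# Hypothetical exclusion name; the text is the one from the question.
exclusion = build_exclusion("dataflow-worker-configuration", "Worker configuration")
print(json.dumps(exclusion, indent=2))
```

The printed `filter` value can be pasted into the query box from step 5 to check what it matches before creating the exclusion.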