google-app-engine google-cloud-platform stackdriver google-cloud-stackdriver

Mysterious Stackdriver error seems to suggest a bug with Stackdriver itself


For around three weeks now, I have been seeing these errors in the Google Cloud console for a Golang App Engine app:


{"errorReference":"ixWACIGAwIxxxxxxxxMCJu5TG-ykYAjIKGICAwIm7lMb7KWiBgMCJu5TG-ykD", "incident_id":"0.mz2xxxxxgg", "summary":"An internal error occurred while invoking the webhook."}

There is no other information associated with the event apart from `resource.type = "stackdriver_notification_channel"`.

How could I debug this issue further? It's problematic, because I keep getting emails about errors in my logs.

That phrase does not occur in my source code, and a Google search for it yields zero results.

Around the date the errors started appearing, I made no changes that would affect logging in my code.

It really does seem like an internal Stackdriver error.


Solution

  • The likely cause is an incorrectly configured URL for the webhook endpoint.

    Your webhook endpoint may be configured with an internal URL that Google Cloud cannot resolve, which would produce this failure message.

    The webhook URL must be publicly resolvable and publicly accessible, not private, as described in "Webhook notifications aren't received" and "Create and manage notification channels". Whether the URL needs a trailing "/" depends on the application at the receiving end. Google Cloud does not control that, so we cannot know for sure.

    What needs to be done to make the URL publicly resolvable:

    This is a domain that Google Cloud does not control, so we do not know the specifics. Usually there needs to be at least an A record on the authoritative name servers for the db.com domain that resolves to an IP address. Your internal team should be able to help you with this.