I've been using Stackdriver Logging for a long time, and now I'd like to take advantage of Error Reporting as well. For various reasons, I'd prefer to use Python's logging mechanism and scrape exceptions out of a log file, if possible, rather than use the error_reporting library. However, the documentation is very confusing. For example, https://cloud.google.com/error-reporting/docs/setup/compute-engine#log_exceptions says:
First, install the fluent-logger-python library:
sudo pip install google-cloud-error-reporting --upgrade
This leads me to believe that google-cloud-error-reporting is a fork of, or otherwise related to, fluent-logger-python. However, when I initialize google-cloud-error-reporting, it calls out directly to the GCE metadata server rather than connecting to the local fluentd. Are these two unrelated packages, or is the documentation wrong or misleading? If I send JSON-formatted exceptions to fluentd, or to a log file monitored by fluentd, will Error Reporting understand them?
Thanks for any clarification.
The documentation is wrong.
TL;DR: Whatever reaches Stackdriver has to look like the format described at https://cloud.google.com/error-reporting/docs/formatting-error-messages
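To give a rough idea of what that format looks like from the Python side, here is a minimal sketch that builds such a payload for an exception with a traceback. The helper names (build_error_report, report_for_failure), the "conductor" service name, and the "unknown" version default are made up for illustration; only the serviceContext and message field names come from the formatting page linked above.

```python
import json
import traceback


def build_error_report(service, version="unknown"):
    """Build a payload for the exception currently being handled.

    Must be called from inside an `except` block so that
    traceback.format_exc() has an active exception to format.
    """
    return {
        # Hypothetical service name/version; use whatever identifies your app.
        "serviceContext": {"service": service, "version": version},
        # The full Python traceback goes into the message field.
        "message": traceback.format_exc(),
    }


def report_for_failure():
    """Trigger a sample error and return the payload describing it."""
    try:
        {}["missing"]
    except KeyError:
        return build_error_report("conductor")


if __name__ == "__main__":
    # One JSON object per line is convenient for a tailed log file.
    print(json.dumps(report_for_failure()))
```

In my setup I don't emit JSON from Python at all; fluentd builds the equivalent structure from plain-text log lines, as shown in the config that follows.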
Here's my solution:
#Parse raw log entries to expose severity field so that
#StackDriver log viewer can properly categorize (and so we can filter)
<source>
@type tail
path /var/log/conductor
pos_file /var/log/td-agent/conductor.pos
format multiline
format_firstline /\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}/
format1 /^(?<message>(?<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d*\s*[a-zA-Z_]*\s*(?<severity>[A-Z]*).*)/
read_from_head true
multiline_flush_interval 3s
tag conductor.app
</source>
#Add hostname field
<filter conductor.app>
@type record_transformer
<record>
hostname ${hostname}
</record>
</filter>
#Filter and tag log entries of severity ERROR or CRITICAL
<match conductor.app>
@type rewrite_tag_filter
rewriterule1 severity ERROR|CRITICAL conductor.err
rewriterule2 severity .+ conductor.info
</match>
#Process entries with tracebacks differently than those without
<match conductor.err>
@type rewrite_tag_filter
rewriterule1 message .*Traceback conductor.err.traceback
rewriterule2 message .+ conductor.err.message
</match>
#Parse out the traceback
<match conductor.err.traceback>
@type parser
key_name message
format multiline
format1 /^(?<message>(?<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d*\s*(?<log>[a-zA-Z_]*)\s*(?<severity>[A-Z]*).*(?<traceback>Traceback .*))/
tag conductor.err.traceback.report
</match>
#Format traceback reports
<filter conductor.err.traceback.report>
@type record_transformer
<record>
serviceContext {
"service": "${record[\"log\"]}"
}
message ${record["traceback"]}
</record>
remove_keys traceback
</filter>
#Process errors that don't have tracebacks
<match conductor.err.message>
@type parser
key_name message
format multiline
format1 /^(?<message>(?<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d*\s*(?<log>[a-zA-Z_]*)\s*(?<severity>[A-Z]*):\s*(?<report>.*))/
tag conductor.err.message.report
</match>
#For errors without tracebacks we have to stub out some fields that
#error reporting requires, but we don't have
<filter conductor.err.message.report>
@type record_transformer
<record>
serviceContext {
"service": "${record[\"log\"]}"
}
message ${record["report"]}
reportLocation {
"filePath": "None",
"lineNumber": 0,
"functionName": "None"
}
</record>
</filter>
#Send to StackDriver logging!
<match conductor.**>
@type google_cloud
buffer_chunk_limit 2M
flush_interval 5s
max_retry_wait 300
disable_retry_limit
num_threads 8
</match>
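The regular expressions above expect log lines shaped like "2017-01-01 12:00:00,123 conductor ERROR: something failed". As a sketch, a Python logging setup along the following lines should produce them. The format string, the make_logger and emit_sample helpers, and the Python translation of the regex (LINE_RE, using (?P<name>...) groups and re.DOTALL so that "." spans the traceback lines) are my own illustration, not part of the fluentd config.

```python
import io
import logging
import re

# Python translation of the "errors without tracebacks" format1 pattern.
# re.DOTALL is an assumption standing in for fluentd's multiline matching.
LINE_RE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d*\s*"
    r"(?P<log>[a-zA-Z_]*)\s*(?P<severity>[A-Z]*):\s*(?P<report>.*)",
    re.DOTALL,
)

# Default asctime looks like "2017-01-01 12:00:00,123".
LOG_FORMAT = "%(asctime)s %(name)s %(levelname)s: %(message)s"


def make_logger(stream, name="conductor"):
    """Return a logger that writes LOG_FORMAT lines to `stream`."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def emit_sample():
    """Log one exception and return the text that was written."""
    buf = io.StringIO()
    logger = make_logger(buf)
    try:
        1 / 0
    except ZeroDivisionError:
        # logger.exception logs at ERROR and appends the traceback.
        logger.exception("division failed")
    return buf.getvalue()


if __name__ == "__main__":
    match = LINE_RE.match(emit_sample())
    print(match.group("log"), match.group("severity"))
```

Note that the log group only accepts letters and underscores, so a dotted logger name such as "conductor.worker" would not match without loosening that character class.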