google-cloud-platform logback google-cloud-logging spring-logback

Logging to the console can be slow


Google has some recommendations about logging in their blog page here:

Logging recommendations for containerized applications

Before we dive into some typical use cases for logging in GKE, let's first review some best practices for using Cloud Logging with containerized applications:

Use the native logging mechanisms of containers to write the logs to stdout and stderr. Log directly with structured logging with different fields. You can then search your logs more effectively based on those fields.

On the other hand, Logback has a warning here:

Logging to the console can be slow

Writing to the console can be up to a thousand times slower than writing to a file. Therefore, we strongly recommend avoiding console logging in production environments, especially in high-volume systems where performance is critical.

Additionally, some application servers or frameworks automatically capture console's output and redirect it to a logging backend such as logback-classic. If this captured data is written to the console, for instance using ConsoleAppender, race conditions can occur, potentially leading to application deadlocks.

Question: Don't these two recommendations conflict with each other? How do they relate, and where should we write our logs? Are "console" and "stdout" the same thing? If so, one source recommends using it and the other advises against it.

We are currently using ch.qos.logback.core.ConsoleAppender with StackdriverJsonLayout.

We see the following logs, which come from the ConsoleAppender source:

INFO in ch.qos.logback.core.ConsoleAppender[JSON] - BEWARE: Writing to the console can be very slow. Avoid logging to the 
INFO in ch.qos.logback.core.ConsoleAppender[JSON] - console in production environments, especially in high volume systems.
INFO in ch.qos.logback.core.ConsoleAppender[JSON] - See also https://logback.qos.ch/codes.html#slowConsole
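
For context, a ConsoleAppender with a JSON layout boils down to printing one JSON object per line to stdout (Logback's ConsoleAppender targets System.out by default). The sketch below illustrates that idea using only the JDK. It is not Logback's or StackdriverJsonLayout's implementation; the class name StdoutJsonLogger and the two fields severity and message are assumptions chosen for illustration.

```java
import java.io.PrintStream;

/** Hypothetical helper: writes one JSON object per line to the given stream. */
final class StdoutJsonLogger {
    private final PrintStream out;

    StdoutJsonLogger(PrintStream out) {
        this.out = out;
    }

    /** Emits a single structured log line with two illustrative fields. */
    void log(String severity, String message) {
        out.println("{\"severity\":\"" + escape(severity)
                + "\",\"message\":\"" + escape(message) + "\"}");
    }

    /** Minimal JSON string escaping: quotes, backslashes, and control characters. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 8);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // System.out is the process's stdout, i.e. the "console" in Logback's terms.
        new StdoutJsonLogger(System.out).log("INFO", "order \"42\" accepted");
    }
}
```

Passing the stream in through the constructor keeps the sketch testable; in a container, the line written to System.out is what the node's logging agent picks up.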

Solution

  • I found the following answer from a colleague:

    If your application is a high-volume production system that generates more than several hundred KiB/s of logs, you should consider switching to another method.

    For example, Google Kubernetes Engine (GKE) provides a default log throughput of at least 100 KiB/s per node, which can scale up to 10 MiB/s on underutilized nodes with sufficient CPU resources. At higher throughputs, however, there is a risk of log loss.

    Check your log throughput in the Metrics explorer. Based on that value, a rough recommendation is:

    Log Throughput           Recommended Approach
    < 100 KiB/s per node     Console logging (ConsoleAppender)
    100 KiB/s – 500 KiB/s    Buffered/asynchronous file-based logging
    > 500 KiB/s              Direct API integration or optimized agents
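
The table above can be read as a simple threshold lookup. Here is a minimal sketch of that lookup in Java, assuming the throughput has already been read from the Metrics explorer as KiB/s per node. The class, enum, and method names are hypothetical, and treating exactly 500 KiB/s as part of the middle band is my reading of the table, not something the source states.

```java
/** Hypothetical helper: maps a measured per-node log throughput to the rough recommendation. */
final class LogThroughputAdvisor {

    enum Approach {
        CONSOLE_LOGGING,                // ConsoleAppender writing to stdout
        BUFFERED_ASYNC_FILE_LOGGING,    // buffered/asynchronous file-based logging
        DIRECT_API_OR_OPTIMIZED_AGENT   // direct API integration or optimized agents
    }

    private LogThroughputAdvisor() {}

    /**
     * @param kibPerSecondPerNode measured log throughput in KiB/s per node
     * @return the approach suggested by the thresholds in the table
     */
    static Approach recommend(double kibPerSecondPerNode) {
        if (Double.isNaN(kibPerSecondPerNode) || kibPerSecondPerNode < 0) {
            throw new IllegalArgumentException("throughput must be a non-negative number");
        }
        if (kibPerSecondPerNode < 100.0) {
            return Approach.CONSOLE_LOGGING;
        }
        if (kibPerSecondPerNode <= 500.0) { // assumption: the upper bound is inclusive
            return Approach.BUFFERED_ASYNC_FILE_LOGGING;
        }
        return Approach.DIRECT_API_OR_OPTIMIZED_AGENT;
    }

    public static void main(String[] args) {
        System.out.println(recommend(42.0));
        System.out.println(recommend(250.0));
        System.out.println(recommend(2048.0));
    }
}
```

The thresholds are only the rough guidance quoted above, so treat the result as a starting point and verify against the measured log loss on your own nodes.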