I'm having an issue with logs from a GKE container going missing in Cloud Logging.
I have a Spring Boot application deployed on GKE with Log4j2. All the logs generated by the application are always written to Cloud Logging, so if I execute 100 transactions in parallel using JMeter, I can find all the logs in Cloud Logging without problems (logs at the beginning, middle and end of the REST controller).
Now I am migrating from Log4j2 to Logback to get full integration with Cloud Logging, following this guide: https://docs.spring.io/spring-cloud-gcp/docs/current/reference/html/logging.html After the migration, in which I only changed the logging dependency from Log4j2 to Logback, I can still see my logs in Cloud Logging, but some of them are missing.
For example, if I send 30 parallel transactions using JMeter, I can see all the logs generated by the service. I search for each message like this:
"Starting transaction: "
"This is the mid of controller"
"End of trx, cleaning MDC context : "
The logger calls look like this:
Logger.info("Starting transaction: {}", transactionId);
Logger.info("This is the mid of controller");
Logger.info("End of trx, cleaning MDC context : {}", transactionId);
MDC.clear();
return response;
I'm searching for messages logged at the start of the REST controller, in the middle of the controller, and at the end of the controller, just before the "return response;".
So if I send 30 transactions in parallel using JMeter, I can find all the log messages in Cloud Logging. But if I repeat the same 30 transactions 1 minute later, I find some of the logs but not all of them. For example, I find:
30 of **Starting transaction:**,
22 of "This is the mid of controller"
2 of "End of trx, cleaning MDC context : "
Then, if I repeat:
20 of **Starting transaction:**,
0 of "This is the mid of controller"
0 of "End of trx, cleaning MDC context : "
If I wait 5 minutes and repeat:
30 of **Starting transaction:**,
30 of "This is the mid of controller"
30 of "End of trx, cleaning MDC context : "
In some cases I literally can't find any logs for a specific transaction.
In all cases the service responds correctly: even when I can't see all the logs, I know the service is working because I receive a 200 and the expected response body. There are no inconsistencies in the responses either.
Sorry for the long intro but now the questions.
1 - Is Cloud Logging skipping similar logs? I'm always sending the same transaction from JMeter in all cases, so the only difference between transactions is the transactionId (generated at the beginning of the REST controller).
2 - If I send a request manually using Postman, I can find all the logs. Could Cloud Logging be skipping similar logs generated at almost the same time by parallel transactions?
I have tested the same cases locally and everything works fine: even if I send 100 transactions in parallel every second in a long loop, I can find all the logs generated by the service (I'm writing the logs to a file). So I only have this issue on GKE.
Also, as I understand it, @RestController is thread safe, and I'm not seeing inconsistencies in the logs or responses. I'm using MDC with includeMDC enabled in the Logback configuration: I add the transactionId to the MDC context with MDC.put("transactionId", transactionId). If I'm not wrong, MDC is also thread safe, so it should not be the problem.
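To illustrate the MDC point: Logback keeps the MDC map per thread, so parallel requests should not see each other's transactionId. Below is a minimal sketch of that behaviour using only the JDK. MdcSketch, handleRequest and runParallel are hypothetical names, and the ThreadLocal map is a stand-in for org.slf4j.MDC, not the real class. It also clears the context in a finally block, since servlet containers reuse pooled threads.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for org.slf4j.MDC: one context map per thread. */
public class MdcSketch {
    private static final ThreadLocal<Map<String, String>> CONTEXT =
            ThreadLocal.withInitial(HashMap::new);

    public static void put(String key, String value) { CONTEXT.get().put(key, value); }
    public static String get(String key) { return CONTEXT.get().get(key); }
    public static void clear() { CONTEXT.get().clear(); }

    /** Simulates one request and returns the transactionId read back from the context. */
    public static String handleRequest(String transactionId) {
        put("transactionId", transactionId);
        try {
            // ... controller work and log calls would go here ...
            return get("transactionId");
        } finally {
            // Pooled threads are reused, so always clean up, even on exceptions.
            clear();
        }
    }

    /** Runs n simulated requests on a small pool; maps each sent id to the id observed. */
    public static Map<String, String> runParallel(int n) throws InterruptedException {
        Map<String, String> observed = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < n; i++) {
            String id = UUID.randomUUID().toString();
            pool.execute(() -> observed.put(id, handleRequest(id)));
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return observed;
    }

    public static void main(String[] args) throws InterruptedException {
        Map<String, String> observed = runParallel(30);
        boolean isolated = observed.entrySet().stream()
                .allMatch(e -> e.getKey().equals(e.getValue()));
        System.out.println(observed.size() + " requests, ids isolated: " + isolated);
    }
}
```

If every request reads back its own id, the per-thread context is not what drops the log entries, which points at how the entries are shipped rather than how they are produced.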
My logback file looks like this.
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <include resource="org/springframework/cloud/gcp/autoconfigure/logging/logback-appender.xml"/>
    <include resource="org/springframework/boot/logging/logback/defaults.xml"/>
    <include resource="org/springframework/boot/logging/logback/console-appender.xml"/>
    <appender name="CONSOLE_JSON_APP" class="ch.qos.logback.core.ConsoleAppender">
        <encoder class="ch.qos.logback.core.encoder.LayoutWrappingEncoder">
            <layout class="org.springframework.cloud.gcp.logging.StackdriverJsonLayout">
                <includeTraceId>true</includeTraceId>
                <includeSpanId>true</includeSpanId>
                <includeLevel>true</includeLevel>
                <includeThreadName>true</includeThreadName>
                <includeMDC>true</includeMDC>
                <includeLoggerName>true</includeLoggerName>
                <includeContextName>true</includeContextName>
                <includeMessage>true</includeMessage>
                <includeFormattedMessage>true</includeFormattedMessage>
                <includeExceptionInMessage>true</includeExceptionInMessage>
                <includeException>true</includeException>
                <serviceContext>
                    <service>APP-LOG</service>
                </serviceContext>
            </layout>
        </encoder>
    </appender>
    <appender name="CONSOLE_JSON_EXT" class="ch.qos.logback.core.ConsoleAppender">
        <encoder class="ch.qos.logback.core.encoder.LayoutWrappingEncoder">
            <layout class="org.springframework.cloud.gcp.logging.StackdriverJsonLayout">
                <projectId>${projectId}</projectId>
                <includeTraceId>true</includeTraceId>
                <includeSpanId>true</includeSpanId>
                <includeLevel>true</includeLevel>
                <includeThreadName>true</includeThreadName>
                <includeMDC>true</includeMDC>
                <includeLoggerName>true</includeLoggerName>
                <includeContextName>true</includeContextName>
                <includeMessage>true</includeMessage>
                <includeFormattedMessage>true</includeFormattedMessage>
                <includeExceptionInMessage>true</includeExceptionInMessage>
                <includeException>true</includeException>
                <serviceContext>
                    <service>EXT-LOG</service>
                </serviceContext>
            </layout>
        </encoder>
    </appender>
    <!-- Loggers -->
    <root level="INFO" name="info-log">
        <appender-ref ref="CONSOLE_JSON_EXT"/>
    </root>
    <logger name="com.example.test.service" level="INFO" additivity="false">
        <appender-ref ref="CONSOLE_JSON_APP"/>
    </logger>
</configuration>
The REST controller looks like this.
@RestController
public class TestServiceController {

    @PostMapping("/evaluate")
    public Response evaluate(@RequestBody Request request) {
        UUID transactionId = UUID.randomUUID();
        // Logger is the class's logger instance (declaration omitted);
        // {} is the placeholder that receives transactionId
        Logger.info("Starting transaction: {}", transactionId);
        MDC.put("transactionId", transactionId.toString());
        // Some Java code here (only simple things)
        Logger.info("This is the mid of controller");
        // Some Java code here (only simple things)
        Logger.info("End of trx, cleaning MDC context : {}", transactionId);
        MDC.clear();
        return transaction.getResponse();
    }
}
At the moment my only guess is that Cloud Logging is skipping similar logs generated within a short period of time (basically, parallel executions).
Try adjusting the flushing settings. For example, set flushLevel to DEBUG. Docs about flushLevel: https://docs.spring.io/spring-cloud-gcp/docs/current/reference/html/logging.html#_log_via_api
I've seen the issue you described when applications aren't configured to flush logs directly to stdout/stderr.
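As a rough sketch of the flushLevel suggestion, the fragment below sets it on the API-based appender. It assumes the google-cloud-logging-logback LoggingAppender is on the classpath and is used in place of the console appenders. The appender name STACKDRIVER follows the linked docs; the rest of the layout is illustrative, not taken from the question. As far as I know, flushLevel is a property of this API appender only; a plain ConsoleAppender has no such setting.

```xml
<configuration>
    <!-- Assumption: API-based appender from google-cloud-logging-logback -->
    <appender name="STACKDRIVER" class="com.google.cloud.logging.logback.LoggingAppender">
        <!-- Flush the locally buffered entries whenever an entry at DEBUG or above is logged -->
        <flushLevel>DEBUG</flushLevel>
    </appender>

    <root level="INFO">
        <appender-ref ref="STACKDRIVER"/>
    </root>
</configuration>
```

Flushing on every entry trades throughput for completeness, so it is probably best treated as a diagnostic setting first and tuned afterwards.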