cassandra gocql

Cassandra debug log analysis


I have a Cassandra debug.log that contains many SELECT * queries that none of my applications issue. The applications request specific fields in their SELECT queries, and the logged queries also carry a LIMIT 5000 clause that I am fairly sure no application uses. Are these queries fired by Cassandra internally? The debug log is filled with them. The applications use the gocql driver to connect to Cassandra.

<SELECT * FROM table_name WHERE id = 0 LIMIT 5000>, was slow 45 times: avg/min/max 4969/4925/4996 msec - slow timeout 500 msec/cross-node
DEBUG [ScheduledTasks:1] 2021-01-14 18:02:33,271 MonitoringTask.java:152 - 160 operations timed out in the last 5004 msecs:
<SELECT * FROM table_name WHERE id = abcd LIMIT 5000>, total time 7038 msec, timeout 5000 msec/cross-node
<SELECT * FROM table_name WHERE id = efgh LIMIT 5000>, total time 5793 msec, timeout 5000 msec/cross-node
<SELECT * FROM table_name WHERE id = hijk LIMIT 5000>, total time 5289 msec, timeout 5000 msec/cross-node
<SELECT * FROM table_name WHERE id = lmnop LIMIT 5000>, total time 5826 msec, timeout 5000 msec/cross-node
<SELECT * FROM table_name WHERE id = qrst LIMIT 5000>, total time 6006 msec, timeout 5000 msec/cross-node
<SELECT * FROM table_name WHERE id = uvwx LIMIT 5000>, total time 5905 msec, timeout 5000 msec/cross-node
<SELECT * FROM table_name WHERE id = yzabc LIMIT 5000>, total time 5217 msec, timeout 5000 msec/cross-node 
.
..
....
.....
... (110 were dropped)

Solution

  • All of those queries are coming from your application. Cassandra does not generate them internally.

    Those messages from MonitoringTask are logged by a feature in Cassandra 3.10+ called slow query logging (CASSANDRA-12403). I've previously explained it in this post -- https://community.datastax.com/questions/7835/.

    Slow query logging aggregates queries that took longer than slow_query_log_timeout_in_ms (default 500 ms) into 5-second reporting "windows". As part of the aggregation, the logged statement does not enumerate the selected columns; they are replaced with an asterisk (*) so that identical queries can be grouped easily.

    In addition, drivers have paging enabled by default. When your application does not set a page size, the driver falls back to a default page size of 5000, which is the LIMIT 5000 that appears in the slow query messages you posted. Cheers!
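To check how these entries line up with your application's queries, it can help to pull the numbers out of the log. The following is a minimal Go sketch, not part of Cassandra or gocql, that parses one aggregated slow-query entry of the shape shown in the question. The names SlowQuery and ParseSlowQuery are hypothetical, and the regular expression is an assumption based only on the sample line in the question; other entry shapes (such as the "total time ... timeout" lines) are deliberately not handled.

```go
package main

import (
	"regexp"
	"strconv"
)

// SlowQuery holds the fields of one aggregated slow-query entry.
// The type is illustrative only; it is not part of any library.
type SlowQuery struct {
	Statement string // CQL text as logged, with columns collapsed to "*"
	Limit     int    // value of the LIMIT clause, 0 if none is present
	Count     int    // how many times the query was slow in the window
	AvgMs     int    // average latency in milliseconds
	MinMs     int    // minimum latency in milliseconds
	MaxMs     int    // maximum latency in milliseconds
	TimeoutMs int    // slow-query threshold reported in the entry
}

var (
	// Assumed line shape, taken from the sample in the question:
	// <STATEMENT>, was slow N times: avg/min/max A/B/C msec - slow timeout T msec...
	slowRe  = regexp.MustCompile(`^<(.+)>, was slow (\d+) times: avg/min/max (\d+)/(\d+)/(\d+) msec - slow timeout (\d+) msec`)
	limitRe = regexp.MustCompile(`(?i)\bLIMIT (\d+)`)
)

// ParseSlowQuery parses one aggregated entry.
// ok is false when the line does not match the assumed shape.
func ParseSlowQuery(line string) (q SlowQuery, ok bool) {
	m := slowRe.FindStringSubmatch(line)
	if m == nil {
		return SlowQuery{}, false
	}
	// Capture groups 2..6 are count, avg, min, max, timeout.
	nums := make([]int, 5)
	for i := range nums {
		n, err := strconv.Atoi(m[i+2])
		if err != nil {
			return SlowQuery{}, false
		}
		nums[i] = n
	}
	q = SlowQuery{
		Statement: m[1],
		Count:     nums[0],
		AvgMs:     nums[1],
		MinMs:     nums[2],
		MaxMs:     nums[3],
		TimeoutMs: nums[4],
	}
	// The LIMIT in the logged statement reflects the page size of the request.
	if lm := limitRe.FindStringSubmatch(q.Statement); lm != nil {
		if n, err := strconv.Atoi(lm[1]); err == nil {
			q.Limit = n
		}
	}
	return q, true
}
```

Calling ParseSlowQuery on the first entry from the question gives Count 45, AvgMs 4969, MinMs 4925, MaxMs 4996, TimeoutMs 500 and Limit 5000. If the Limit matches the driver's page size (in gocql, the PageSize field on ClusterConfig or the PageSize method on a Query, as far as I know), that is a good sign the entry corresponds to one page request from your application rather than an internal query.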