mysqlmysql-slow-query-logpt-query-digest

pt-query-digest is coming up with no results and "0 total, 1 unique". Any idea how to solve that?


It appears for some of our logs, the percona toolkit pt-query-digest tool works fine, but for others we get the following output:

# Files: /.../mysqld_slow.log
# Overall: 0 total, 1 unique, 0 QPS, 0x concurrency ______________________
# Attribute          total     min     max     avg     95%  stddev  median
# ============     ======= ======= ======= ======= ======= ======= =======
# Query size        18.19M  18.19M  18.19M  18.19M  18.19M       0  18.19M

# Profile
# Rank       Query ID   Response time Calls      R/Call     Apdx V/M   Ite
# ========== ========== ========== ========== ========== ==== ===== ======
$

Anyone know what could be wrong with my log file? it appears to be valid, having the following first 10 lines:

Sep 28 00:00:37 gcdb-master mysqld_slow_log: SELECT  `companies`.* FROM `companies`  WHERE `companies`.`id` = 576 LIMIT 1;
Sep 28 00:00:37 gcdb-master mysqld_slow_log: # User@Host: db_one[db_one] @ ip-127.0.0.1.ec2.internal [127.0.0.1]
Sep 28 00:00:37 gcdb-master mysqld_slow_log: # Query_time: 0.041188  Lock_time: 0.000046 Rows_sent: 1  Rows_examined: 46418
Sep 28 00:00:37 gcdb-master mysqld_slow_log: SET timestamp=1348790434;
Sep 28 00:00:37 gcdb-master mysqld_slow_log: SELECT  `companies`.* FROM `companies`  WHERE `companies`.`id` = 286358 LIMIT 1;
Sep 28 00:00:37 gcdb-master mysqld_slow_log: # User@Host: db_one[db_one] @ ip-127.0.0.1.ec2.internal [127.0.0.1]
Sep 28 00:00:37 gcdb-master mysqld_slow_log: # Query_time: 0.030769  Lock_time: 0.000050 Rows_sent: 1  Rows_examined: 46583
Sep 28 00:00:37 gcdb-master mysqld_slow_log: SET timestamp=1348790434;
Sep 28 00:00:37 gcdb-master mysqld_slow_log: SELECT  `companies`.* FROM `companies`  WHERE `companies`.`id` = 286679 LIMIT 1;
Sep 28 00:00:37 gcdb-master mysqld_slow_log: # User@Host: db_one[db_one] @ ip-127.0.0.1.ec2.internal [127.0.0.1]
Sep 28 00:00:37 gcdb-master mysqld_slow_log: # Query_time: 0.594351  Lock_time: 0.000038 Rows_sent: 12  Rows_examined: 342673

Solution

  • I have run some tests with your sample output, and I suspect that your file is not valid. This file, obtained from yours by cutting the syslog-like portion of each line, and supplemented with the two missing #description-lines before the first query, appears to work

    # User@Host: db_one[db_one] @ ip-127.0.0.1.ec2.internal [127.0.0.1]
    # Query_time: 0.041188  Lock_time: 0.000046 Rows_sent: 1  Rows_examined: 46418
    SELECT  `companies`.* FROM `companies`  WHERE `companies`.`id` = 576 LIMIT 1;
    # User@Host: db_one[db_one] @ ip-127.0.0.1.ec2.internal [127.0.0.1]
    # Query_time: 0.041188  Lock_time: 0.000046 Rows_sent: 1  Rows_examined: 46418
    SET timestamp=1348790434;
    SELECT  `companies`.* FROM `companies`  WHERE `companies`.`id` = 286358 LIMIT 1;
    # User@Host: db_one[db_one] @ ip-127.0.0.1.ec2.internal [127.0.0.1]
    # Query_time: 0.030769  Lock_time: 0.000050 Rows_sent: 1  Rows_examined: 46583
    SET timestamp=1348790434;
    SELECT  `companies`.* FROM `companies`  WHERE `companies`.`id` = 286679 LIMIT 1;
    # User@Host: db_one[db_one] @ ip-127.0.0.1.ec2.internal [127.0.0.1]
    # Query_time: 0.594351  Lock_time: 0.000038 Rows_sent: 12  Rows_examined: 342673
    

    as does the same file, where I have removed the first line, so as to start with a #description line, and output:

    # 240ms user time, 20ms system time, 24.59M rss, 87.74M vsz
    # Current date: Fri Nov  2 22:03:02 2012
    # Hostname: mintaka
    # Files: orig.log
    # Overall: 3 total, 1 unique, 0 QPS, 0x concurrency ______________________
    # Attribute          total     min     max     avg     95%  stddev  median
    # ============     ======= ======= ======= ======= ======= ======= =======
    # Exec time          113ms    31ms    41ms    38ms    40ms     5ms    40ms
    # Lock time          142us    46us    50us    47us    49us     2us    44us
    # Rows sent              3       1       1       1       1       0       1
    # Rows examine     136.15k  45.33k  45.49k  45.38k  44.45k    0.00  44.45k
    # Query size           234      76      79      78   76.28    1.50   76.28
    
    # Profile
    # Rank Query ID           Response time Calls R/Call Apdx V/M   Item
    # ==== ================== ============= ===== ====== ==== ===== ==========
    #    1 0x0C756AF10BC44B0D 0.1131 100.0%     3 0.0377 1.00  0.00 SELECT companies
    
    # Query 1: 0 QPS, 0x concurrency, ID 0x0C756AF10BC44B0D at byte 226 ______
    # This item is included in the report because it matches --limit.
    # Scores: Apdex = 1.00 [1.0]*, V/M = 0.00
    # Query_time sparkline: |    ^   |
    # Attribute    pct   total     min     max     avg     95%  stddev  median
    # ============ === ======= ======= ======= ======= ======= ======= =======
    # Count        100       3
    # Exec time    100   113ms    31ms    41ms    38ms    40ms     5ms    40ms
    # Lock time    100   142us    46us    50us    47us    49us     2us    44us
    # Rows sent    100       3       1       1       1       1       0       1
    # Rows examine 100 136.15k  45.33k  45.49k  45.38k  44.45k    0.00  44.45k
    # Query size   100     234      76      79      78   76.28    1.50   76.28
    # String:
    # Hosts        ip-127.0.0.1.ec2.internal
    # Users        db_one
    # Query_time distribution
    #   1us
    #  10us
    # 100us
    #   1ms
    #  10ms  ################################################################
    # 100ms
    #    1s
    #  10s+
    # Tables
    #    SHOW TABLE STATUS LIKE 'companies'\G
    #    SHOW CREATE TABLE `companies`\G
    # EXPLAIN /*!50100 PARTITIONS*/
    SELECT  `companies`.* FROM `companies`  WHERE `companies`.`id` = 286358 LIMIT 1\G
    

    So I guess that the problem is connected with log file format and possibly rotation (e.g. the file is truncated so that the initial #description is lost).

    Also, I've looked a bit into the Percona utility code. The default parser (slowlog) searches for a #Time line to get timestamp, which I find in my slow-log, but is not present in yours. This does not influence reading per se, but it might skew the results.

    It also uses ";\n#" as input record separator, so the syslog format is a definite "no" for the default --type slowlog. The others also do not appear to work with syslog+slowlog format.

    I've tried changing the input record separator and adding a hack to remove the syslog part of each line, and the result appears to work, but I fear it does not because the results on a known slowlog aren't consistent.

    I am afraid the simplest thing to do is to chop off the beginning of the file, then cut off the beginning of each line, before feeding it to the utility:

    sed -e '/.*: #/,$b' -e 'd' < slow.log \
    | cut -d' ' -f6- \
    | pt-query-digest -