linuxperformanceext3

blktrace to measure the IO performance


This is the ouput of the blktrace. I could not understand what is "N 0 (00 ..) [multipathd]". I'm testing the write IO performance of the FS.

I have 2 doubts,

  1. N - is a action, but I dont find the usage of it in the blktrace.pdf.
  2. What is the difference between IOSTAT and BLKTRACE.

blktrace o/p:

  8,128  7       11    85.638053443  4009  I   N 0 (00 ..) [multipathd]   
  8,128  7       12    85.638054275  4009  D   N 0 (00 ..) [multipathd]   
  8,128  2       88    89.861199377  5210  A   W 384 + 8 <- (253,0) 384   
  8,128  2       89    89.861199876  5210  Q   W 384 + 8 [i_worker_0]   
  8,128  2       90    89.861202645  5210  G   W 384 + 8 [i_worker_0]   
  8,128  2       91    89.861204604  5210  P   N [i_worker_0]   
  8,128  2       92    89.861205587  5210  I  WA 384 + 8 [i_worker_0]   
  8,128  2       93    89.861210869  5210  D  WA 384 + 8 [i_worker_0]   
  8,128  2       94    89.861499857     0  C  WA 384 + 8 [0]   
  8,128  2       95    99.845910681  5230  A   W 384 + 8 <- (253,0) 384   
  8,128  2       96    99.845911148  5230  Q   W 384 + 8 [i_worker_20]   
  8,128  2       97    99.845913846  5230  G   W 384 + 8 [i_worker_20]   
  8,128  2       98    99.845915910  5230  P   N [i_worker_20]   
  8,128  2       99    99.845917081  5230  I  WA 384 + 8 [i_worker_20]   
  8,128  2      100    99.845922597  5230  D  WA 384 + 8 [i_worker_20]

Solution

  • There is introduction to blktrace http://duch.mimuw.edu.pl/~lichota/09-10/Optymalizacja-open-source/Materialy/10%20-%20Dysk/gelato_ICE06apr_blktrace_brunelle_hp.pdf

    difference between IOSTAT and BLKTRACE.

    Check slides 5 and 6:

    The iostat utility does provide information pertaining to request queues associated with  specifics devices

    • – Average I/O time on queue, number of merges, number of  blocks read/written, ...
    • – However, it does not provide detailed information  on a per­I/O basis

    Blktrace. Low-overhead, configurable kernel component which emits events for specific operations performed on each I/O entering the block I/O layer

    So, iostat is generic tool to output statistics; and blktrace is tool to capture and output more information about all I/O requests served in the time when tool was active.

    Slide 11 has some decoding intro

     8,128  7       11    85.638053443  4009  I   N 0 (00 ..) [multipathd]   
     maj/min cpu  seq#   timestamp_s.ns  pid  ACT RWBS blocks    process
    

    multipathd is kernel daemon, because its name is included into [] braces.

    The default format is described in the blktrace.pdf (here is source of the pdf: http://git.kernel.org/cgit/linux/kernel/git/axboe/blktrace.git/tree/doc/blktrace.tex)

    "%D %2c %8s %5T.%9t %5p %2a %3d "
    

    %D Displays the event's device major/minor as: \%3d,\%-3d. %2c CPU ID (2-character field). %8s Sequence number %5T.%9t 5-charcter field for the seconds portion of the time stamp and a 9-character field for the nanoseconds in the time stamp. %5p 5-character field for the process ID. %2a 2-character field for one of the actions. %3d 3-character field for the RWBS data.

    Actions

    C -- complete
    D -- issued
    I -- inserted
    Q -- queued
    B -- bounced
    M -- back merge
    F -- front merge
    G -- get request
    S -- sleep
    P -- plug
    U -- unplug
    T -- unplug due to timer
    X -- split
    A -- remap
    m -- message
    

    RWBS

    'R' - read,
    'W' - write 
    'D' - block discard operation
    'B' for barrier operation or 
    'S' for synchronous operation.
    

    So, for multipathd we have "I" action = "inserted" and N for RWBS, and the N is strange. There is no N in the doc and even in the source: blkparse_fmt.c - fill_rwbs(). Why? Because it is old doc and old source.

    In modern kernel, for example, 3.12 there is N in the fill_rwbs: http://sources.debian.net/src/linux/3.12.6-2/kernel/trace/blktrace.c?hl=1038#L1038

    if (t->action == BLK_TN_MESSAGE) {
        rwbs[i++] = 'N';
        goto out;
    }
    

    And the blktrace_api.h declares BLK_TN_MESSAGE as

    #define BLK_TN_MESSAGE        (__BLK_TN_MESSAGE | BLK_TC_ACT(BLK_TC_NOTIFY))
    
     * Trace categories
        BLK_TC_NOTIFY   = 1 << 10,  /* special message */
    
     * Notify events.
        __BLK_TN_MESSAGE,       /* Character string message */
    

    So, 'N' is notify action with string message. I think the message is seen instead of "blocks" field. I was able to find the patch which added the TN_MESSAGE, but there was no update of the documentation (just as planned in bazaar-model like linux) http://lkml.org/lkml/2009/3/27/31 "[PATCH v2 6/7] blktrace: print out BLK_TN_MESSAGE properly" 2009