Vim: calling xxd with system command in substitution results in conversion error


Background: I have a log file containing hex dumps that I want to convert with xxd to get the ASCII column that shows possible strings in the binary data.

The log file format looks like this:

My interesting hex dump:
00 53 00 6f 00 6d 00 65 00 20 00 74 00 65 00 78
00 74 00 20 00 65 00 78 00 61 00 6d 00 70 00 6c
00 65 00 20 00 75 00 73 00 69 00 6e 00 67 00 20
00 55 00 54 00 46 00 2d 00 31 00 36 00 20 00 69
00 6e 00 20 00 6f 00 72 00 64 00 65 00 72 00 20
00 74 00 6f 00 20 00 67 00 65 00 74 00 20 00 30
00 78 00 30 00 30 00 20 00 62 00 79 00 74 00 65
00 73 00 2e

Visually selecting the hex dump and running xxd -r -p, followed by xxd -g1 on the result, does exactly what I'm aiming for. However, since there are quite a few dumps to convert, I would rather automate the process. So I'm using the following substitute command to do the conversion:

:%s/\(\x\{2\} \?\)\{16\}\_.*/\=system('xxd -g1',system('xxd -r -p',submatch(0)))

The expression matches the entire hex dump in the log file. The match is sent to xxd -r -p as stdin and its output is used as stdin for xxd -g1. Well, that's the idea at least.

The thing is that the above almost works. It produces the following result:

My interesting hex dump:
00000000: 01 53 01 6f 01 6d 01 65 01 20 01 74 01 65 01 78  .S.o.m.e. .t.e.x
00000010: 01 74 01 20 01 65 01 78 01 61 01 6d 01 70 01 6c  .t. .e.x.a.m.p.l
00000020: 01 65 01 20 01 75 01 73 01 69 01 6e 01 67 01 20  .e. .u.s.i.n.g. 
00000030: 01 55 01 54 01 46 01 2d 01 31 01 36 01 20 01 69  .U.T.F.-.1.6. .i
00000040: 01 6e 01 20 01 6f 01 72 01 64 01 65 01 72 01 20  .n. .o.r.d.e.r. 
00000050: 01 74 01 6f 01 20 01 67 01 65 01 74 01 20 01 30  .t.o. .g.e.t. .0
00000060: 01 78 01 30 01 30 01 20 01 62 01 79 01 74 01 65  .x.0.0. .b.y.t.e
00000070: 01 73 01 2e                                      .s..

All 00 bytes have mysteriously transformed into 01. It should have produced the following:

My interesting hex dump:
00000000: 00 53 00 6f 00 6d 00 65 00 20 00 74 00 65 00 78  .S.o.m.e. .t.e.x
00000010: 00 74 00 20 00 65 00 78 00 61 00 6d 00 70 00 6c  .t. .e.x.a.m.p.l
00000020: 00 65 00 20 00 75 00 73 00 69 00 6e 00 67 00 20  .e. .u.s.i.n.g. 
00000030: 00 55 00 54 00 46 00 2d 00 31 00 36 00 20 00 69  .U.T.F.-.1.6. .i
00000040: 00 6e 00 20 00 6f 00 72 00 64 00 65 00 72 00 20  .n. .o.r.d.e.r. 
00000050: 00 74 00 6f 00 20 00 67 00 65 00 74 00 20 00 30  .t.o. .g.e.t. .0
00000060: 00 78 00 30 00 30 00 20 00 62 00 79 00 74 00 65  .x.0.0. .b.y.t.e
00000070: 00 73 00 2e                                      .s..

What am I not getting here?

Of course I can use macros and other ways of doing this, but I want to understand why my substitution command doesn't do what I expect.

Edit:

For anyone who wants to achieve the same thing, here is the substitution expression that works on an entire file. The expression above was only for testing against the log file example. The one below performs a correct conversion; it is modified based on the information Kent provided in his answer.

:%s/\(\(\x\{2\} \)\{16\}\_.\)\+/\=system('xxd -p -r | xxd -g1',submatch(0))
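The pipeline that this system() call hands to the shell can also be tried on its own, outside Vim. A minimal sketch, assuming xxd is on the PATH; the function name redump is made up for illustration:

```shell
# Hypothetical helper (not part of xxd or Vim): turn a plain hex dump on
# stdin back into raw bytes, then re-dump it with one byte per group and
# an ASCII column.
redump() {
    xxd -r -p | xxd -g1
}

# First row of the sample dump from the log file.
printf '00 53 00 6f 00 6d 00 65 00 20 00 74 00 65 00 78\n' | redump
```

Because the bytes go straight from one xxd process to the next, the 00 bytes should come through unchanged.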

Solution

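  • Very likely, the problem is the string conversion in system(). Vim converts the input into a string, and it does the same to the output of your first xxd command. A Vim string cannot hold NUL bytes, so system() replaces every NUL in the command output with SOH (0x01), which is where the 01 bytes come from.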

    To see this, you can extract the hex part into a file and then run:

    xxd -r -p theFile|vim -
    

    If you then call system('xxd -g1', alltext), you will get something other than 00 as well.

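    This does not behave like a shell pipe (xxd ...|xxd ...), where the raw bytes flow straight from one process to the next. Note that system() hands its command to the shell, so a pipeline inside a single system() call does work, as the substitution in the question's edit shows.

    If you want to keep the two nested calls in your :s command, use systemlist() for the first xxd call so that the data survives in binary form, then pass the result to the second xxd. systemlist() returns a list of lines in which NUL bytes are kept as newline characters inside the items, and system() turns them back into NULs when it is given a list as input: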

    :%s/\(\x\{2\} \?\)\{16\}\_.*/\=system('xxd -g1',systemlist('xxd -r -p',submatch(0)))
    

    The command above will produce the 00 bytes, since the NULs are not lost to string conversion.

    However, when working with data other than plain text, it may be simpler to use a filter instead of calling system(). For your example:

    2,$!xxd -r -p|xxd -g1
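The 00 to 01 change described above can be imitated outside Vim. This is a rough sketch, not Vim's actual code path: tr stands in for the NUL to SOH replacement that Vim's documentation describes for system(), xxd is assumed to be on the PATH, and the function name redump_via_string is made up for illustration.

```shell
# Hypothetical helper: imitate the lossy step. The raw bytes from
# `xxd -r -p` pass through a NUL -> SOH replacement (tr plays the role
# of Vim's string conversion) before they reach `xxd -g1`.
redump_via_string() {
    xxd -r -p | tr '\000' '\001' | xxd -g1
}

# First row of the sample dump; every 00 byte shows up as 01 in the output.
printf '00 53 00 6f 00 6d 00 65 00 20 00 74 00 65 00 78\n' | redump_via_string
```

Dropping the tr stage gives the plain pipe, which leaves the 00 bytes untouched.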