
Meaning of a Common String In Executables?


There appear to be some similar-looking long alphanumeric strings that commonly occur in Mach-O 64-bit executables and ELF 64-bit LSB executables, among other symbols that are not alphanumeric:

cat /bin/bash | grep -c "AWAVAUATSH"

has 181 results, and

cat /usr/bin/gzip | grep -c "AWAVAUATSH"

has 9 results.


What are these strings?


Solution

  • Interesting question. Since I didn't know the answer, here are the steps I took to figure it out:

    Where in the file does the string occur?

    strings -otx /bin/gzip | grep AWAVAUATUSH
       35e0 AWAVAUATUSH
       69a0 AWAVAUATUSH
       7920 AWAVAUATUSH
       8900 AWAVAUATUSH
       92a0 AWAVAUATUSH
    

    Which section is that in?

    readelf -WS /bin/gzip
    
    There are 28 section headers, starting at offset 0x16860:
    
    Section Headers:
      [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
      [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
      [ 1] .interp           PROGBITS        0000000000400238 000238 00001c 00   A  0   0  1
      [ 2] .note.ABI-tag     NOTE            0000000000400254 000254 000020 00   A  0   0  4
      [ 3] .note.gnu.build-id NOTE            0000000000400274 000274 000024 00   A  0   0  4
      [ 4] .gnu.hash         GNU_HASH        0000000000400298 000298 000038 00   A  5   0  8
      [ 5] .dynsym           DYNSYM          00000000004002d0 0002d0 000870 18   A  6   1  8
      [ 6] .dynstr           STRTAB          0000000000400b40 000b40 000360 00   A  0   0  1
      [ 7] .gnu.version      VERSYM          0000000000400ea0 000ea0 0000b4 02   A  5   0  2
      [ 8] .gnu.version_r    VERNEED         0000000000400f58 000f58 000080 00   A  6   1  8
      [ 9] .rela.dyn         RELA            0000000000400fd8 000fd8 000090 18   A  5   0  8
      [10] .rela.plt         RELA            0000000000401068 001068 0007e0 18   A  5  12  8
      [11] .init             PROGBITS        0000000000401848 001848 00001a 00  AX  0   0  4
      [12] .plt              PROGBITS        0000000000401870 001870 000550 10  AX  0   0 16
      [13] .text             PROGBITS        0000000000401dc0 001dc0 00f1ba 00  AX  0   0 16
      [14] .fini             PROGBITS        0000000000410f7c 010f7c 000009 00  AX  0   0  4
    ... etc.
    

    From the output above, we see that all instances of AWAVAUATUSH are in the .text section, which covers file offsets [0x1dc0, 0x10f7a).
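The section lookup above can also be done mechanically. The following is a minimal sketch, assuming the file offsets and sizes are copied by hand from the readelf output above; `SECTIONS` and `section_for_offset` are illustrative names, not part of any tool.

```python
# (name, file offset, size) copied from the readelf -WS output above.
SECTIONS = [
    (".init", 0x001848, 0x00001a),
    (".plt",  0x001870, 0x000550),
    (".text", 0x001dc0, 0x00f1ba),
    (".fini", 0x010f7c, 0x000009),
]

def section_for_offset(offset):
    """Return the name of the section whose [start, end) file range holds offset."""
    for name, start, size in SECTIONS:
        if start <= offset < start + size:
            return name
    return None

# File offsets reported by strings for AWAVAUATUSH in /bin/gzip.
hits = [0x35e0, 0x69a0, 0x7920, 0x8900, 0x92a0]
print({hex(h): section_for_offset(h) for h in hits})
```

Every offset in `hits` should map to `.text`, matching the manual reading of the table.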

    Since this is .text, we expect to find executable instructions there. The address we are interested in is 0x401dc0 (.text address) + 0x35e0 (offset of AWAVAUATUSH in the file) - 0x1dc0 (offset of .text in the file) == 0x4035e0.
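As a sanity check, the same translation can be written as a small helper. This is only a sketch: `file_offset_to_vaddr` is a hypothetical name, and it assumes the section is loaded at the address readelf reports, as it is for this non-relocated gzip binary.

```python
def file_offset_to_vaddr(file_offset, section_vaddr, section_file_offset):
    """Map a file offset inside a section to the matching virtual address."""
    return section_vaddr + (file_offset - section_file_offset)

# .text address (0x401dc0) and file offset (0x1dc0) come from readelf;
# 0x35e0 is the first AWAVAUATUSH offset reported by strings.
vaddr = file_offset_to_vaddr(0x35e0, 0x401dc0, 0x1dc0)
print(hex(vaddr))
```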

    First, let's check that the above arithmetic is correct:

    gdb -q /bin/gzip
    
    (gdb) x/s 0x4035e0
    0x4035e0:       "AWAVAUATUSH\203\354HdH\213\004%("
    

    Yes, it is. Next, what are the instructions there?

    (gdb) x/20i 0x4035e0
       0x4035e0:    push   %r15
       0x4035e2:    push   %r14
       0x4035e4:    push   %r13
       0x4035e6:    push   %r12
       0x4035e8:    push   %rbp
       0x4035e9:    push   %rbx
       0x4035ea:    sub    $0x48,%rsp
       0x4035ee:    mov    %fs:0x28,%rax
       0x4035f7:    mov    %rax,0x38(%rsp)
       0x4035fc:    xor    %eax,%eax
       0x4035fe:    mov    0x213363(%rip),%rax        # 0x616968
       0x403605:    mov    %rdi,(%rsp)
       0x403609:    mov    %rax,0x212cf0(%rip)        # 0x616300
       0x403610:    cmpb   $0x7a,(%rax)
       0x403613:    je     0x403730
       0x403619:    mov    $0x616300,%ebx
       0x40361e:    mov    (%rsp),%rdi
       0x403622:    callq  0x4019f0 <strlen@plt>
       0x403627:    cmp    $0x20,%eax
       0x40362a:    mov    %rax,0x8(%rsp)
    

    These indeed look like normal executable instructions. What is the encoding of push %r15? An x86-64 opcode table shows that 0x41, 0x57 is indeed push %r15, and these bytes just happen to spell AW in ASCII. Similarly, push %r14 is encoded as 0x41, 0x56, which just happens to spell AV, and so on.
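The coincidence is easy to reproduce outside the debugger. A minimal sketch, assuming the byte values are the encodings of the six push instructions in the disassembly, followed by 0x48, the first byte of the sub instruction:

```python
# 41 57 = push %r15, 41 56 = push %r14, 41 55 = push %r13, 41 54 = push %r12,
# 55 = push %rbp, 53 = push %rbx, 48 = first byte of the following sub.
prologue = bytes.fromhex("41 57 41 56 41 55 41 54 55 53 48")

# Every byte falls in the printable ASCII range, so it decodes cleanly.
print(prologue.decode("ascii"))
```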

    P.S. My version of gzip is fully stripped, which is why GDB shows no symbols in the above disassembly. If I use a non-stripped version instead, I see:

    strings -o -tx gzip | grep AWAVAUATUSH | head -1
       6be0 AWAVAUATUSH
    
    readelf -WS gzip | grep text
      [13] .text             PROGBITS        0000000000401b00 001b00 00d102 00  AX  0   0 16
    

    So the string is still in .text.

    gdb -q ./gzip
    (gdb) p/a 0x0000000000401b00 + 0x6be0 - 0x001b00
    $1 = 0x406be0 <inflate_dynamic>
    
    (gdb) disas/r 0x406be0
    Dump of assembler code for function inflate_dynamic:
       0x0000000000406be0 <+0>:     41 57   push   %r15
       0x0000000000406be2 <+2>:     41 56   push   %r14
       0x0000000000406be4 <+4>:     41 55   push   %r13
       0x0000000000406be6 <+6>:     41 54   push   %r12
       0x0000000000406be8 <+8>:     55      push   %rbp
       0x0000000000406be9 <+9>:     53      push   %rbx
       0x0000000000406bea <+10>:    48 81 ec 38 05 00 00    sub    $0x538,%rsp
    ...
    

    Now you can clearly see the ASCII 0x4157415641554154... sequence of opcodes.

    P.P.S. The original question asks about AWAVAUATSH, which does appear in my Mach-O bash and gzip, but not in Linux ones. Conversely, AWAVAUATUSH does not appear in my Mach-O binaries.

    The answer, however, is the same. The AWAVAUATSH sequence is the same as AWAVAUATUSH, but with push %rbp (the U byte, 0x55) omitted.
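To compare the two strings, here is a deliberately tiny decoder. It is a sketch, not a disassembler: it only recognises the one-byte push opcodes 0x50 to 0x57, optionally preceded by the 0x41 prefix that selects r8 to r15, and stops at the first byte it does not understand. The name `decode_pushes` is made up for this example.

```python
LOW_REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]
HIGH_REGS = ["r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]

def decode_pushes(data):
    """Decode a leading run of push instructions; stop at anything else."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b == 0x41 and i + 1 < len(data) and 0x50 <= data[i + 1] <= 0x57:
            # 0x41 prefix: the same opcode now names r8..r15.
            out.append("push %" + HIGH_REGS[data[i + 1] - 0x50])
            i += 2
        elif 0x50 <= b <= 0x57:
            out.append("push %" + LOW_REGS[b - 0x50])
            i += 1
        else:
            break  # e.g. 'H' (0x48), which starts the following sub
    return out

print(decode_pushes(b"AWAVAUATUSH"))  # the Linux variant
print(decode_pushes(b"AWAVAUATSH"))   # the Mach-O variant
```

The two results differ only by the `push %rbp` entry.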

    P.P.P.S. Here are some other "fun" strings of the same nature:

    strings /bin/bash | grep '^A.A.A.' | sort | uniq -c | sort -nr | head
         44 AWAVAUATUSH
         27 AVAUATUSH
         16 AWAVAUA
         15 AVAUATUH
         14 AWAVAUI
         14 AWAVAUATUH
         12 AWAVAUATI
          8 AWAVAUE1
          8 AVAUATI
          6 AWAVAUATU
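The pipeline above can be approximated in Python when strings is not available. This is a rough sketch under stated assumptions: `count_prologue_strings` is a hypothetical helper, it treats only bytes 0x20 to 0x7e as printable, and its counts may differ slightly from what strings reports for a real binary.

```python
import re
from collections import Counter

def count_prologue_strings(data, min_len=4):
    """Count printable runs that match the grep pattern '^A.A.A.'."""
    # Roughly what strings does: runs of at least min_len printable bytes.
    runs = re.findall(rb"[\x20-\x7e]{%d,}" % min_len, data)
    # Roughly what grep '^A.A.A.' does: 'A' at positions 0, 2 and 4.
    return Counter(r.decode("ascii") for r in runs if re.match(rb"A.A.A.", r))

# Small in-memory sample so the sketch does not depend on any file.
sample = b"\x00AWAVAUATUSH\x83\xec\x00AWAVAUATUSH\x81\x00AVAUATUSH\x00hello\x00"
counts = count_prologue_strings(sample)
print(counts.most_common())
```

For a real file, one could pass `open("/bin/bash", "rb").read()` as `data` and look at `counts.most_common(10)`.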