macos shell arm64 argv apple-silicon

Why does the program path I print in the shell not match what I find in memory using a debugger?


I am just starting to learn ARM assembly on my Apple silicon (M2) Mac. I wrote a program that takes its command-line arguments (argv), prints them using the write system call, and returns their count (argc).

The program works: it prints the path to the binary exactly as I typed it when calling it.

But when I use lldb to examine the memory location from which I believe argv[0] is read, it always contains the absolute path.

Is this because lldb always runs it using the absolute path? Is there a way to find out? If so, is that what lldb should do, or is it a bug?

Here is the source code for my program.

// ARM assembly program on M2 for macOS 14.7.1
// print argv separated by newlines, return argc
.global _start
.p2align 2
// input from OS:       W0 ... argc
//                      X1 ... char **argv
//                             argv[0] points to a NUL-separated concatenation
//                             of the elements of argv (for some reason)
//
// WORKING MEM: W19 argc
//              X1 start of the current argument, for printing
//              X2 current str length
//              W21 argc loop decr counter
//              X22 char * cursor, advances through the argv strings
//              X23 char * newline

_start:
        mov W19, W0             // W0 holds the number of args, copy
        adr X23, chr_newline    // make *"\n" available for printing
// set up loop to print all arguments
        mov W21, W19       // put argc into loop counter
        ldr X22, [X1]      // X22 := argv[0] (char *)
loop_argv:
        bl handle_arg      // print one argument
        sub W21, W21, #1   // decr loop counter
        cmp W21, #0        // loop if > 0
        b.gt loop_argv
// exit
        mov  W0,   W19     // return code := argc
        mov  X16,  #1      // service code for termination
        svc  #0x80         // make sys call
// local function handle_arg
handle_arg:
        mov X1, X22        // save start char * in X1
        mov X2, #0         // X2 should contain len at end
count_chars_loop: // search for NUL char separating args
        ldrb W0, [X22], #1  // W0 = *X22 (one byte), then X22 += 1
        cmp W0, #0          // check if the byte just read was the NUL char
        add X2, X2, #1      // incr len (add does not change the flags)
        b.gt count_chars_loop
        sub X2, X2, #1      // correct for overcounting
        // X22 = start of next argument now
// print argv[i]
        mov X0, #1                // to stdout
        // char * for this argument is already in X1
        // len(argv[i]) is already in X2
        mov  X16, #4              // nr for write call
        svc       #0x80           // make sys call
// print newline
        mov X0, #1                // to stdout
        mov X1, X23               // X1 = char * newline
        mov X2, #1                // len("\n")
        mov  X16, #4              // nr for write call
        svc       #0x80           // make sys call
        ret
.align 2
chr_newline: .ascii "\n"

I compile and link it using

as get_args.s -o get_args_min.o
ld -o bin/get_args_min get_args_min.o -lSystem -syslibroot /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk -e _start -arch arm64     

Here is what I see on the command line:

me@c get_args % ./bin/get_args_min test test
./bin/get_args_min
test
test
me@c get_args %

Note the relative path. (I tried calling it with the absolute path, too, and then I do get it on the terminal.) But the location we print from seems to always contain the full absolute path to the binary. To check this, I used

lldb -- ./bin/get_args_min test test

...then the lldb commands

b handle_arg
r
re r

...then copied the address in X22, then

memory read [PASTE]

Solution

  • This is likely caused by the way lldb launches your program, namely using its absolute path, whereas the shell uses the path you specified (relative or absolute).

    When you start lldb, it shows the executable it will launch. Even if you don't add a directory prefix to the executable path, it sets the executable to its absolute path:

    $ lldb -- get_args_min foo bar
    (lldb) target create "get_args_min"
    Current executable set to '/tmp/get_args_min' (arm64).
    

    As I understand it from man execve, the value of argv[0] is a convention rather than something the system enforces: it is up to the calling program (here the shell or lldb) to set it.
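
The claim that the caller chooses argv[0] can be checked without a debugger. Below is a minimal sketch, assuming bash is installed: `run_as` is a hypothetical helper (not part of the original post) that uses bash's `exec -a` builtin option to launch a command with an arbitrary argv[0]. The target here is `bash -c 'echo "$0"'`, which reports its own argv[0] as `$0` when no further operands follow the command string.

```shell
# Hypothetical helper: run a command with a caller-chosen argv[0].
# Relies on bash's `exec -a NAME`, which passes NAME as the zeroth argument.
run_as() {
    name=$1
    shift
    # Inside the inner bash, "$0" is the chosen name and "$@" is the command.
    bash -c 'exec -a "$0" "$@"' "$name" "$@"
}

# The same binary is executed both times; only argv[0] differs.
run_as ./relative/name bash -c 'echo "$0"'   # prints ./relative/name
run_as /absolute/name bash -c 'echo "$0"'    # prints /absolute/name
```

The same helper should work with the program from the question, for example `run_as anything ./bin/get_args_min test test`, in which case the first line printed would be whatever name was passed in rather than the path typed on the command line.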