webassembly emscripten

Can I find the start address of "free" memory when using emscripten to compile to WebAssembly?


I have a C application that exposes a function taking a string, for example:

void EMSCRIPTEN_KEEPALIVE modify_entity(char* target) { ... }

In order to interact with this, I need to put the target somewhere into the WebAssembly memory. Essentially, my Embedder would create the string in the format that the C code expects (in this case, a null-terminated ASCII string) and put it into the WebAssembly memory, and then pass a pointer to that memory address into the function.
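As a rough illustration of that step, a JavaScript embedder could encode the string and copy it into the exported memory along these lines. This is only a sketch: the helper name writeCString is hypothetical, and ptr is assumed to already point at a region that is safe to overwrite.

```javascript
// Hypothetical helper: copy `str` into `memory` at `ptr` as a
// null-terminated byte string and return the number of bytes written.
// Assumes `ptr` points at a region that is safe to overwrite.
function writeCString(memory, ptr, str) {
  const bytes = new TextEncoder().encode(str); // UTF-8; same bytes as ASCII for ASCII input
  const heap = new Uint8Array(memory.buffer);
  if (ptr + bytes.length + 1 > heap.length) {
    throw new RangeError("string does not fit into memory at ptr");
  }
  heap.set(bytes, ptr);
  heap[ptr + bytes.length] = 0; // terminating NUL expected by the C side
  return bytes.length + 1;
}

// Stand-alone demonstration with a fresh one-page (64 KiB) memory.
const demoMemory = new WebAssembly.Memory({ initial: 1 });
const written = writeCString(demoMemory, 1024, "entity-42");
```

With a real module, memory would be the instance's exported memory instead of demoMemory.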

But: How do I know where in memory to put this?

If I look at the .wat file, I see that emscripten adds several data sections:

  ...
  (data $d10 (i32.const 1619) "\14")
  (data $d11 (i32.const 1631) "\17\00\00\00\00\17\00\00\00\00\09\14\00\00\00\00\00\14\00\00\14")
  (data $d12 (i32.const 1677) "\16")
  ...

My understanding is that these data segments exist in the same memory (the single (export "memory" (memory $memory))) since at the moment, WebAssembly implementations only support a single memory. So if I were to write into memory at address 1619, I'd overwrite the contents of that data segment, which would be bad.

Is that correct? And if that is correct, is there some way for an Embedder to find a valid memory address of free memory? I can manually check the .wat file and find the first free memory, but I wonder if there's a variable or function?

The only thing that looks somewhat promising is a global: (global $g0 (mut i32) (i32.const 67456)). But that global seems to be used throughout the generated WebAssembly code, so I'm not sure if I should use that as a free memory pointer.

Or would I have to somehow expose a malloc/free function from WebAssembly? I know there are quite a few implementations I could use, but I was hoping that the Embedder could handle all that itself, because all I need is to find some free memory that's safe to use. Is that possible, or is that a folly because I'd be competing with whatever internal memory allocation happens on the C side?


Solution

  • You have several options:

    1. As already suggested, you could export and call the C function _malloc.
    2. Even easier, you could call stringToNewUTF8(my_js_string), which returns a pointer to _malloc'd memory containing your string. You should release it with _free once you are done with it.
    3. Better still, you could call stringToUTF8OnStack, which avoids the allocator and returns a stack address pointing to the encoded string. That address is only valid until the stack is restored.
    4. Best of all, you can use ccall or cwrap to do all of this for you, e.g. ccall('modify_entity', null, ['string'], [my_js_string]); See https://emscripten.org/docs/porting/connecting_cpp_and_javascript/Interacting-with-code.html#interacting-with-code-ccall-cwrap
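For an embedder that works with the raw instance exports rather than Emscripten's generated JavaScript, option 1 might look roughly like the sketch below. It assumes the module was built so that malloc and free are exported (for example with something like -sEXPORTED_FUNCTIONS=_malloc,_free; the exact export names depend on your build), and withCString is a hypothetical helper name, not part of Emscripten.

```javascript
// Hypothetical helper: allocate space via the module's own allocator,
// copy the string in, call `fn(ptr)`, and always free the allocation.
// `exports` is assumed to provide `memory`, `malloc` and `free`.
function withCString(exports, str, fn) {
  const bytes = new TextEncoder().encode(str);
  const ptr = exports.malloc(bytes.length + 1);
  if (ptr === 0) throw new Error("malloc failed");
  try {
    // Create the view after malloc: growing memory detaches older views.
    const heap = new Uint8Array(exports.memory.buffer);
    heap.set(bytes, ptr);
    heap[ptr + bytes.length] = 0; // terminating NUL
    return fn(ptr);
  } finally {
    exports.free(ptr);
  }
}
```

A call would then be something like withCString(instance.exports, "target", (ptr) => instance.exports.modify_entity(ptr)). Because the memory comes from the module's own allocator, the embedder does not compete with allocations made on the C side.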