ats

Implementing getenv() in ATS


In (at least) Linux, argv and envp have the same type: they are pointers to a NULL-terminated array of pointers to NUL-terminated strings, and their storage persists for the lifetime of the process, with no need to manage it. envp's strings are of the form "KEY=VALUE", so getenv("KEY") can iterate over envp's strings, compare their leading bytes against "KEY", and then return a pointer to the string starting after the '=', when found.

So, lookup can involve no memory allocation at all, but lookup is slower than what you'd get with a hash table.

When you're given an envp, the OS guarantees validity of the type, but it says nothing about how big the array is.

I'd like to write a getenv() that works with this type as it's really laid out in memory, and I'd like to return pointers into its strings, with minimal allocation.

I'd expect something like this to work:

fun getenv_from(key: string, env: magic): [b:bool] option_vt(string, b) =
    if string2ptr(ptr[0]) > the_null_pointer then (
            case+ string_is_prefix(string_append(key, "="), env[0]) of
            | true => Some_vt(string_make_suffix(env[0], string_length(key)+1))
            | false => getenv_from(key, succ(env))
    ) else None_vt()

where magic should just be envp's ptr plus, I guess, some praxi proof assertions that tell ATS what I've told you about this type.

I've come up with one workaround, which is to pretend that envp really does have argv's type:

#include "share/atspre_staload.hats"

extern fun fake_free{n:int}: argv(n) -> void = "mac#"

%{
void fake_free(void *arg) {}
%}

// print out the first environment variable, as an example
implement main{n:int}(argc, argv, envp) =
    let
        val env2 = $UNSAFE.castvwtp1{argv(n)}(envp)
    in
        if argc > 0 then
            print_string(env2[0]);
        print_newline();
        fake_free(env2);
        0
    end

This works, but now you also have to defeat false constraints involving argc (which has nothing in reality to do with envp), as well as deal with the linear typing, which is unnecessary.

Looking around ATS's library, I think parray_v might be useful, but there are no example uses of parray_v anywhere that I've found.

Note: as a purely practical matter, using C's own getenv() is easy enough from ATS code, as is using the various getenv_gc, getenv_opt, getenv_exn' in ATS's own libraries. So the problem is not "how can I get an environment variable in ATS program?", but really "how can I *properly implement getenv()* in ATS". I'm given a bare pointer. How do I tell ATS about its properties, so that I can work with it in a legal manner? I also don't want to write C-in-ATS with$UNSAFE.ptr_xxx` calls; I don't want to throw away the ATS advantage of provably safe memory access even with raw pointers.


Solution

  • It is a bit involved to use parray_v to implement getenv. Here it is:

    https://github.com/githwxi/ATS-Postiats/blob/master/doc/EXAMPLE/TESTATS/getenv.dats