
C++ standard library and Boehm garbage collector


I want to develop a multi-threaded C++ application on Linux/AMD64/Debian with GCC 4.6 (and probably the latest C++11 standard). Eventually most of the C++ code would be generated by the application itself, which could be viewed as a high-level domain-specific language.

I really want to use Boehm's conservative garbage collector for all my heap allocations, because I want to allocate with new(GC) and never bother with delete. I am assuming that Boehm's GC works well enough.

The main motivation for using C++ (instead of C) is all the algorithms and collections (std::map ... std::vector) provided by the C++ standard library.

Boehm's GC provides a gc_allocator<T> template (in its file gc/gc_allocator.h).

Should I redefine operator ::new as Boehm's one?

Or should I use all the collection templates with an explicit allocator template argument set to some gc_allocator? I don't exactly understand the role of the second template argument (the allocator) of std::vector. Is it used to allocate the vector's internal data, or to allocate each individual element?

And what about std::string? How do I make its data GC-allocated? Should I have my own string type, using the basic_string template with gc_allocator? Is there some way to get the internal arrays of char allocated with GC_malloc_atomic rather than GC_malloc?

Or do you advise against using Boehm's GC with an application compiled by g++?

Regards.


Solution

  • To partly answer my own question, the following code

    // file myvec.cc
    #include <gc/gc.h>
    #include <gc/gc_cpp.h>        // provides new(GC)
    #include <gc/gc_allocator.h>  // provides gc_allocator<T>
    #include <vector>
    
    // Thin wrapper around a vector whose internal buffer is requested from gc_allocator.
    class Myvec {
      std::vector<int,gc_allocator<int> > _vec;
    public:
      Myvec(size_t sz=0) : _vec(sz) {}
      Myvec(const Myvec& v) : _vec(v._vec) {}
      Myvec& operator=(const Myvec &rhs)
        { if (this != &rhs) _vec = rhs._vec; return *this; }
      void resize (size_t sz=0) { _vec.resize(sz); }
      int& operator [] (size_t ix) { return _vec[ix]; }
      const int& operator [] (size_t ix) const { return _vec[ix]; }
      ~Myvec () {}
    };
    
    // C-callable entry points; the Myvec object itself is allocated with new(GC).
    extern "C" Myvec* myvec_make(size_t sz=0) { return new(GC) Myvec(sz); }
    extern "C" void myvec_resize(Myvec*vec, size_t sz) { vec->resize(sz); }
    extern "C" int myvec_get(Myvec*vec, size_t ix) { return (*vec)[ix]; }
    extern "C" void myvec_put(Myvec*vec, size_t ix, int v) { (*vec)[ix] = v; }
    

    when compiled with g++ -O3 -Wall -c myvec.cc, produces an object file with the following symbols:

     % nm -C myvec.o
                     U GC_free
                     U GC_malloc
                     U GC_malloc_atomic
                     U _Unwind_Resume
    0000000000000000 W std::vector<int, gc_allocator<int> >::_M_fill_insert(__gnu_cxx::__normal_iterator<int*, std::vector<int, gc_allocator<int> > >, unsigned long, int const&)
                     U std::__throw_length_error(char const*)
                     U __gxx_personality_v0
                     U memmove
    00000000000000b0 T myvec_get
    0000000000000000 T myvec_make
    00000000000000c0 T myvec_put
    00000000000000d0 T myvec_resize
    

    So there is no reference to plain malloc or ::operator new in the generated code.

    So by using gc_allocator and new(GC), I can apparently be sure that plain ::operator new or malloc is not used without my knowledge, and I don't need to redefine ::operator new.


    Addendum (January 2017)

    For future reference (thanks to Sergey Zubkov for mentioning it in a comment on Quora), see also n2670 and the garbage collection support in <memory> (std::declare_reachable, std::declare_no_pointers, std::pointer_safety, etc.). However, that has not been implemented (except in the trivial but acceptable way of making it a no-op) in current GCC or Clang at least.
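
As a run-time complement to the nm check above, here is a minimal sketch that assumes a C++11 (or later) compiler and uses only the standard library. LoggingAlloc is a hypothetical stand-in for gc_allocator: it calls malloc/free instead of the collector, and every name in the block is illustrative rather than part of Boehm's GC. It suggests two things: the container asks its allocator for whole buffers rather than for one element at a time, and a container with an explicit allocator does not call ::operator new behind your back. The same holds for a basic_string instantiated with the allocator.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>
#include <vector>

// Count every call to the global ::operator new (replacement for illustration only).
static std::size_t g_global_new_calls = 0;
void* operator new(std::size_t n) {
  ++g_global_new_calls;
  if (void* p = std::malloc(n ? n : 1)) return p;
  throw std::bad_alloc();
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// What the allocator was asked for since the last reset.
struct AllocLog {
  std::size_t calls = 0;    // number of allocate() calls
  std::size_t largest = 0;  // largest single request, in elements
};
static AllocLog g_log;

// Hypothetical stand-in for gc_allocator<T>; malloc/free play the collector's role.
template <class T>
struct LoggingAlloc {
  using value_type = T;
  template <class U> struct rebind { using other = LoggingAlloc<U>; };
  LoggingAlloc() noexcept {}
  template <class U> LoggingAlloc(const LoggingAlloc<U>&) noexcept {}
  T* allocate(std::size_t n) {
    ++g_log.calls;
    if (n > g_log.largest) g_log.largest = n;
    void* p = std::malloc(n * sizeof(T));
    if (!p) throw std::bad_alloc();
    return static_cast<T*>(p);
  }
  void deallocate(T* p, std::size_t) noexcept { std::free(p); }
};
template <class T, class U>
bool operator==(const LoggingAlloc<T>&, const LoggingAlloc<U>&) { return true; }
template <class T, class U>
bool operator!=(const LoggingAlloc<T>&, const LoggingAlloc<U>&) { return false; }

struct Report {
  std::size_t allocator_calls;   // allocate() calls made by the container
  std::size_t largest_request;   // largest buffer requested, in elements
  std::size_t global_new_calls;  // ::operator new calls during the demo
};

// Grow a vector to 1000 ints and report where its memory came from.
Report vector_demo() {
  g_log = AllocLog();
  const std::size_t before = g_global_new_calls;
  {
    std::vector<int, LoggingAlloc<int> > v;
    for (int i = 0; i < 1000; ++i) v.push_back(i);
  }
  return Report{g_log.calls, g_log.largest, g_global_new_calls - before};
}

// A string type whose character buffer comes from the custom allocator.
using LoggedString =
    std::basic_string<char, std::char_traits<char>, LoggingAlloc<char> >;

Report string_demo() {
  g_log = AllocLog();
  const std::size_t before = g_global_new_calls;
  {
    LoggedString s(100, 'x');  // long enough that it cannot fit in a small inline buffer
  }
  return Report{g_log.calls, g_log.largest, g_global_new_calls - before};
}
```

Calling vector_demo() should report a handful of allocator calls (one per buffer growth, far fewer than 1000) and no calls to ::operator new. With Boehm's GC, the analogous string type would presumably be spelled std::basic_string<char, std::char_traits<char>, gc_allocator<char> >. The nm output above lists GC_malloc_atomic for a vector of int, which suggests gc_allocator picks the atomic variant for pointer-free element types, but whether that also applies to char should be checked in gc/gc_allocator.h. Note that short strings may be stored inside the string object itself and never reach the allocator.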