c++ stl vector

C++ vector that *doesn't* initialize its members?


I'm making a C++ wrapper for a piece of C code that returns a large array, and so I've tried to return the data in a vector<unsigned char>.

The problem is that the data is on the order of megabytes, and vector zero-initializes its storage unnecessarily, which roughly halves my speed.

How do I prevent this?

Or, if it's not possible -- is there some other STL container that would avoid such needless work? Or must I end up making my own container?

(Pre-C++11)

Note:

I'm passing the vector as my output buffer. I'm not copying the data from elsewhere.
It's something like:

    vector<unsigned char> buf(size);   // Why initialize??
    GetMyDataFromC(&buf[0], buf.size());

Solution

  • When a struct has a user-provided default constructor that doesn't explicitly initialize anything, neither default initialization nor value initialization touches its unsigned char members. So you can wrap the byte in such a struct:

    struct uninitialized_char {
        unsigned char m;
        uninitialized_char() {}
    };
    
    // just to be safe
    static_assert(1 == sizeof(uninitialized_char), "");
    
    std::vector<uninitialized_char> v(4 * (1<<20));
    
    GetMyDataFromC(reinterpret_cast<unsigned char*>(&v[0]), v.size());
    

    I think this is even legal under the strict aliasing rules, since an unsigned char* is allowed to alias any object.

    When I compared the construction time for v with that of a vector<unsigned char>, I got ~8 µs vs. ~12 ms, more than 1000x faster. The compiler was clang 3.2 with libc++, and the flags were: -std=c++11 -Os -fcatch-undefined-behavior -ftrapv -pedantic -Weverything -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-missing-prototypes

    C++11 also has a helper for uninitialized storage, std::aligned_storage, though it requires a compile-time size.
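
    As a rough sketch of what the std::aligned_storage route could look like, assuming a small buffer whose size is known at compile time: kSize, checksum_from_stack_buffer and the body of GetMyDataFromC below are hypothetical stand-ins, not the real C API from the question. Note that std::aligned_storage has been deprecated since C++23.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for the C function from the question:
// it just fills the output buffer with a known byte.
extern "C" void GetMyDataFromC(unsigned char* out, std::size_t size) {
    std::memset(out, 0xAB, size);
}

// The size has to be a compile-time constant (assumed value, for illustration).
const std::size_t kSize = 64 * 1024;

// Fills raw stack storage through the C-style API and sums the bytes.
unsigned long checksum_from_stack_buffer() {
    // Raw storage of at least kSize bytes; nothing is written to it here.
    std::aligned_storage<kSize, alignof(unsigned char)>::type storage;

    unsigned char* buf = reinterpret_cast<unsigned char*>(&storage);
    GetMyDataFromC(buf, kSize);  // first write to the buffer

    unsigned long sum = 0;
    for (std::size_t i = 0; i < kSize; ++i) {
        sum += buf[i];
    }
    return sum;
}
```

    The storage lives on the stack here, so this only suits sizes well below the megabytes mentioned in the question.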


    Here's an added example that compares the total cost, construction plus use (times in nanoseconds):

    VERSION=1 (vector<unsigned char>):

    clang++ -std=c++14 -stdlib=libc++ main.cpp -DVERSION=1 -ftrapv -Weverything -Wno-c++98-compat -Wno-sign-conversion -Wno-sign-compare -Os && ./a.out
    
    initialization+first use: 16,425,554
    array initialization: 12,228,039
    first use: 4,197,515
    second use: 4,404,043
    

    VERSION=2 (vector<uninitialized_char>):

    clang++ -std=c++14 -stdlib=libc++ main.cpp -DVERSION=2 -ftrapv -Weverything -Wno-c++98-compat -Wno-sign-conversion -Wno-sign-compare -Os && ./a.out
    
    initialization+first use: 7,523,216
    array initialization: 12,782
    first use: 7,510,434
    second use: 4,155,241

    Note that the first use is slower in version 2 (about 7.5 ms vs. 4.2 ms), presumably because the cost of touching the freshly allocated pages moves from construction to the first write. The combined time for initialization plus first use is still less than half that of version 1.
    


    #include <iostream>
    #include <chrono>
    #include <locale>    // std::locale, used by std::cout.imbue below
    #include <vector>
    
    struct uninitialized_char {
      unsigned char c;
      uninitialized_char() {}
    };
    
    // Stand-in for the C function: writes every byte of the buffer.
    void foo(unsigned char *c, int size) {
      for (int i = 0; i < size; ++i) {
        c[i] = '\0';
      }
    }
    
    int main() {
      auto start = std::chrono::steady_clock::now();
    
    #if VERSION==1
      using element_type = unsigned char;
    #elif VERSION==2
      using element_type = uninitialized_char;
    #else
    #error "compile with -DVERSION=1 or -DVERSION=2"
    #endif
    
      std::vector<element_type> v(4 * (1<<20));
    
      auto end = std::chrono::steady_clock::now();
    
      foo(reinterpret_cast<unsigned char*>(v.data()), v.size());
    
      auto end2 = std::chrono::steady_clock::now();
    
      foo(reinterpret_cast<unsigned char*>(v.data()), v.size());
    
      auto end3 = std::chrono::steady_clock::now();
    
      std::cout.imbue(std::locale(""));
      std::cout << "initialization+first use: " << std::chrono::nanoseconds(end2-start).count() << '\n';
      std::cout << "array initialization: " << std::chrono::nanoseconds(end-start).count() << '\n';
      std::cout << "first use: " << std::chrono::nanoseconds(end2-end).count() << '\n';
      std::cout << "second use: " << std::chrono::nanoseconds(end3-end2).count() << '\n';
    }
    

    The compiler for this example is clang svn-3.6.0 r218006.
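
    A different route, which is not the method used in the answer above, is an allocator adaptor whose construct() default-initializes elements instead of value-initializing them. The sketch below assumes C++11 (it relies on std::allocator_traits), so it does not help with the pre-C++11 constraint in the question. The names default_init_allocator, byte_buffer and read_data are made up for illustration, and the body of GetMyDataFromC is a hypothetical stand-in for the real C function.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <new>
#include <utility>
#include <vector>

// Allocator adaptor: elements constructed without arguments are
// default-initialized (left untouched for unsigned char) rather than
// value-initialized (zeroed).
template <typename T, typename A = std::allocator<T> >
class default_init_allocator : public A {
    typedef std::allocator_traits<A> traits;
public:
    template <typename U>
    struct rebind {
        typedef default_init_allocator<
            U, typename traits::template rebind_alloc<U> > other;
    };

    using A::A;  // inherit the base allocator's constructors

    // Used by vector(n) and resize(n): `new U` without parentheses
    // performs default initialization.
    template <typename U>
    void construct(U* p) {
        ::new (static_cast<void*>(p)) U;
    }

    // Any construction with arguments goes to the base allocator unchanged.
    template <typename U, typename... Args>
    void construct(U* p, Args&&... args) {
        traits::construct(static_cast<A&>(*this), p,
                          std::forward<Args>(args)...);
    }
};

typedef std::vector<unsigned char,
                    default_init_allocator<unsigned char> > byte_buffer;

// Hypothetical stand-in for the C function from the question.
extern "C" void GetMyDataFromC(unsigned char* out, std::size_t size) {
    std::memset(out, 0xAB, size);
}

// Allocates the buffer without zeroing it, then lets the C code fill it.
byte_buffer read_data(std::size_t size) {
    byte_buffer buf(size);  // elements are not initialized here
    GetMyDataFromC(buf.data(), buf.size());
    return buf;
}
```

    With this approach the elements are real unsigned char objects, so no reinterpret_cast is needed, but the vector type differs from std::vector<unsigned char> and cannot be passed where that exact type is expected.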