c++ stl containers library-design

std::span as a base class for std::vector


I'm currently developing a custom C++ container library that is similar to std::vector, but I also want to have features of std::span baked in. In particular, I want to be able to write functions that take in a std::span-like parameter, and also work with a std::vector-like argument.

What I can do is to construct a class, say my_vector, and another class my_span that can be converted from the class my_vector. This is what the STL does, and I know it's usually a good idea to imitate the standard library. But I had this idea that my_span is basically a my_vector that does not own memory, and so it is possible to implement the two classes using inheritance. Here is what it looks like in code.

#include <cstddef>

template <class T>
class my_vector;

template <class T>
class my_span {
private:
    /* span sees [data_ + start_, data_ + stop_) */
    T* data_;
    std::size_t start_;
    std::size_t stop_;
    friend class my_vector<T>;
public:
    /* Member functions operating on non-owning memory */
};

template <class T>
class my_vector : public my_span<T> {
private:
    std::size_t cap_;
public:
    /* Member functions like resize, push_back, etc. */
};

Now my colleague is rejecting this idea based on the following reasons. To be fair, my representation of his objections might not be faithful.

  1. It is counter-intuitive that a span is defined before the actual container.
  2. Inheritance is used when the derived class is extended, but the class my_vector has the condition that its member start_ will always be 0. (There are reasons that force the pointer data_ to always point at the start of the allocated memory. This is why I can't just use a pointer and the length of the span.)

On the other hand, I believe this design has the following benefits.

  1. If you think about it, my_vector still "is a" my_span. It's just a my_span that owns memory and can change size.
  2. Every member function that operates on non-owning memory can be declared and implemented only once; the class my_vector automatically inherits it.
  3. To use my_vector as a my_span, you don't need to create a new my_span instance. Up-casting is much more natural than a constructor.

I haven't seen a design that follows this pattern, so I wanted to get more opinions on whether this is a good design.


Solution

  • The Liskov substitution principle (LSP) states that an object of a derived class, used through a reference or pointer to its base class, must uphold every invariant and guarantee of the base class.

    This has to hold for every operation, which is harder than you might think.

    Replacing a span's referenced buffer is a perfectly cromulent span operation. Doing so to the span base subobject of a derived vector is toxic! In effect, you end up having to restrict what you can do to a span in order to make this work, resulting in either a crippled span type or an unsafe combination.
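A minimal sketch of that failure mode, under stated assumptions: `int_span`, `int_vector`, `rebind`, and `point_elsewhere` are hypothetical names, not part of the question's library, and the `owned_` member exists only so the demo stays well-defined (the design in the question would free `data_` directly and end up deleting memory it does not own).

```cpp
#include <cstddef>

// Hypothetical non-owning view with an ordinary "rebind" operation.
struct int_span {
    int* data_ = nullptr;
    std::size_t size_ = 0;

    void rebind(int* p, std::size_t n) { data_ = p; size_ = n; }
};

// Hypothetical owning container that inherits from the view.
struct int_vector : int_span {
    int* owned_ = nullptr;  // demo-only record of the real allocation

    explicit int_vector(std::size_t n) : owned_(new int[n]()) {
        data_ = owned_;
        size_ = n;
    }
    int_vector(const int_vector&) = delete;
    int_vector& operator=(const int_vector&) = delete;
    ~int_vector() { delete[] owned_; }  // the inherited design would delete[] data_

    // The invariant the vector relies on: the view points at its own storage.
    bool invariant_holds() const { return data_ == owned_; }
};

// Any function taking a span by reference is entitled to rebind it.
inline void point_elsewhere(int_span& s, int* p, std::size_t n) { s.rebind(p, n); }

// Returns whether the vector's invariant survived a plain span operation.
inline bool vector_survives_rebind() {
    static int other[2] = {1, 2};
    int_vector v(4);
    point_elsewhere(v, other, 2);  // compiles, because int_vector "is a" int_span
    return v.invariant_holds();    // v now views memory it does not own
}
```

Hiding `rebind` in the derived class does not help, because the call goes through a reference to the base.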

    A better option here is probably to just have an implicit conversion from a vector to a span (but not the other way around; that conversion should be explicit because it copies the data and is expensive).
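One way this could look, as a rough sketch rather than a recommended implementation: the class bodies below are hypothetical, `sum` is an illustrative function, and storage is delegated to `std::vector` purely to keep the example short.

```cpp
#include <cstddef>
#include <vector>

template <class T>
class my_span {
    T* data_ = nullptr;
    std::size_t size_ = 0;

public:
    my_span(T* p, std::size_t n) : data_(p), size_(n) {}
    T* begin() const { return data_; }
    T* end() const { return data_ + size_; }
    std::size_t size() const { return size_; }
};

template <class T>
class my_vector {
    std::vector<T> storage_;  // assumption: stands in for hand-written allocation code

public:
    explicit my_vector(std::size_t n, T value = T()) : storage_(n, value) {}

    // Explicit: copying a span's elements allocates and is O(n).
    explicit my_vector(my_span<T> s) : storage_(s.begin(), s.end()) {}

    // Implicit: building a view is O(1) and allocates nothing.
    operator my_span<T>() { return my_span<T>(storage_.data(), storage_.size()); }

    std::size_t size() const { return storage_.size(); }
    T& operator[](std::size_t i) { return storage_[i]; }
};

// Hypothetical function written once against the view type.
inline int sum(my_span<int> s) {
    int total = 0;
    for (int x : s) total += x;
    return total;
}
```

With this shape, `sum(v)` accepts a `my_vector<int>` directly, while turning a span back into a vector has to be spelled out as `my_vector<int> copy(s);`.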

    On top of that, containers of data often consider the data to be part of them, while views of data don't. So obtaining begin/end iterators that can mutate the contents is a const operation on a span, but not on a vector!

    template<class T>
    struct span {
      T* data = nullptr;
      std::size_t length = 0;
      T* begin() const { return data; }
      T* end() const { return data+length; }
    };
    template<class T>
    struct vector {
      T* data = nullptr;
      std::size_t length = 0;
      std::size_t capacity = 0;
      T const* begin() const { return data; }
      T const* end() const { return data+length; }
      T * begin() { return data; }
      T * end() { return data+length; }
    };
    

    This is another subtle difference.
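The difference can be checked at compile time. The sketch below repeats the two structs inside a hypothetical `demo` namespace (only to avoid clashing with standard names) and adds a small helper, `write_through_const_span`, that is not part of the original answer.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

namespace demo {

template <class T>
struct span {
    T* data = nullptr;
    std::size_t length = 0;
    T* begin() const { return data; }
    T* end() const { return data + length; }
};

template <class T>
struct vector {
    T* data = nullptr;
    std::size_t length = 0;
    std::size_t capacity = 0;
    T const* begin() const { return data; }
    T const* end() const { return data + length; }
    T* begin() { return data; }
    T* end() { return data + length; }
};

// A const span still hands out mutable access to the elements...
static_assert(std::is_same<decltype(std::declval<const span<int>&>().begin()),
                           int*>::value,
              "const span yields a mutable pointer");

// ...while a const vector only hands out read-only access.
static_assert(std::is_same<decltype(std::declval<const vector<int>&>().begin()),
                           const int*>::value,
              "const vector yields a const pointer");

// Writes through a const reference to a span and returns the modified element.
inline int write_through_const_span() {
    int buf[3] = {1, 2, 3};
    span<int> s;
    s.data = buf;
    s.length = 3;
    const span<int>& view = s;
    *view.begin() = 42;  // allowed: constness of the view is not constness of the data
    return buf[0];
}

}  // namespace demo
```

If the vector inherited `begin() const` from the span, a `const` vector would let callers modify its elements.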

    The rule I follow for span-likes (as in, array views) is that they are in charge of converting-from. They will convert from:

    1. Raw C arrays.
    2. Initializer lists. (warning: somewhat dangerous)
    3. Any object with a .data() returning a pointer (to a compatible type), and a .size() returning an integral value. Note we are doing pointer arithmetic, so "compatible" means "identical up to cv-qualification (const/volatile)"; a pointer to a derived type does not qualify.

    And they deduce their element type from all of the above (using class template argument deduction, available since C++17).

    Rule #3 catches std::vector, std::array, and std::string "for free".
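Rules 1 and 3 might be sketched as follows, assuming C++17. The names `array_view`, `data_ptr_t`, `size_result_t`, and `sum` are hypothetical, and this is only one possible way to write the constraints, not the author's actual implementation.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Type of c.data() for an lvalue of container type C.
template <class C>
using data_ptr_t = decltype(std::declval<C&>().data());

// Type of c.size() for an lvalue of container type C.
template <class C>
using size_result_t = decltype(std::declval<C&>().size());

template <class T>
class array_view {
    T* data_ = nullptr;
    std::size_t size_ = 0;

public:
    array_view() = default;
    array_view(T* p, std::size_t n) : data_(p), size_(n) {}

    // Rule 1: raw C arrays.
    template <std::size_t N>
    array_view(T (&arr)[N]) : data_(arr), size_(N) {}

    // Rule 3: any object with .data() and .size(), where the pointee type
    // matches T up to cv-qualification and .size() is integral.
    template <class C,
              class P = data_ptr_t<C>,
              class = std::enable_if_t<
                  std::is_pointer_v<P> &&
                  std::is_same_v<std::remove_cv_t<std::remove_pointer_t<P>>,
                                 std::remove_cv_t<T>> &&
                  std::is_convertible_v<P, T*> &&
                  std::is_integral_v<size_result_t<C>>>>
    array_view(C& c)
        : data_(c.data()), size_(static_cast<std::size_t>(c.size())) {}

    T* begin() const { return data_; }
    T* end() const { return data_ + size_; }
    std::size_t size() const { return size_; }
};

// Deduction guide for rule 3: the element type is whatever .data() points at.
template <class C>
array_view(C&) -> array_view<std::remove_pointer_t<data_ptr_t<C>>>;

// Hypothetical consumer written once against the view type.
inline int sum(array_view<const int> v) {
    int total = 0;
    for (int x : v) total += x;
    return total;
}
```

Here `sum` accepts a `std::vector<int>` or a `std::array<int, N>` lvalue unchanged, and `array_view v(some_string);` deduces `array_view<char>`. The constructor takes `C&`, so this sketch deliberately refuses temporaries.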

    Rule #2 permits

    void foo( span<const flag> );
    foo( {flag::a, flag::b} );
    

    The danger of initializer lists is that

    span<const int> sp = {1,2,3};
    

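    has a dangling reference in it: the array backing the initializer list is a temporary that is destroyed at the end of the full expression, so sp outlives the data it points at.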
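A small sketch of the safe and unsafe uses of the initializer-list conversion. `const_view`, `combine`, `safe_call`, and `safe_named` are hypothetical names, and the values given to `flag` are assumptions made for the demo.

```cpp
#include <cstddef>
#include <initializer_list>

enum class flag { a = 1, b = 2, c = 4 };  // assumed values, for illustration only

template <class T>
class const_view {
    const T* data_ = nullptr;
    std::size_t size_ = 0;

public:
    // Rule 2: convert from an initializer list (non-owning, so lifetime matters).
    const_view(std::initializer_list<T> il) : data_(il.begin()), size_(il.size()) {}

    const T* begin() const { return data_; }
    const T* end() const { return data_ + size_; }
};

// Hypothetical consumer: ORs all flags together.
inline int combine(const_view<flag> flags) {
    int mask = 0;
    for (flag f : flags) mask |= static_cast<int>(f);
    return mask;
}

// Safe: the backing array lives until the end of the full expression,
// which is after combine() has returned.
inline int safe_call() { return combine({flag::a, flag::b}); }

// Safe: a named initializer_list extends the lifetime of its backing array
// to its own lifetime, so the view stays valid inside this function.
inline int safe_named() {
    std::initializer_list<flag> keep = {flag::a, flag::c};
    const_view<flag> v = keep;
    return combine(v);
}

// Unsafe (left as a comment on purpose): the backing array dies at the
// semicolon, so any later use of `dangling` is undefined behavior.
//   const_view<flag> dangling = {flag::a, flag::b};
```

In short, the conversion is fine for function arguments and risky for any view stored in a variable.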