Tags: c++, constexpr, string-concatenation, string-view, char-pointer

How to join compile-time string-like objects while keeping the API simple?


I am trying to concatenate string-like objects at compile-time. With the help of this post, I came up with something like this:

#include <cstddef>
#include <utility>
#include <algorithm>
#include <array>
#include <string_view>

template <std::size_t N>
class CharArray
{
  private:
    std::array<char, N> _string;

    template <std::size_t S>
    friend class CharArray;

    template <std::size_t S1, std::size_t S2>
    constexpr CharArray(const std::array<char, S1> s1, const std::array<char, S2> s2)
        : _string() {
        std::copy(s1.begin(), s1.end() - 1, _string.begin());
        std::copy(s2.begin(), s2.end() - 1, _string.begin() + S1 - 1);
    }

  public:
    constexpr CharArray(const char (&str)[N]) : _string() {
        std::copy(&str[0], &str[0] + N, _string.begin());
    }

    constexpr CharArray(const std::array<char, N>& str) : _string() {
        std::copy(std::cbegin(str), std::cend(str), _string.begin());
    }

    template <std::size_t S>
    constexpr auto operator+(const CharArray<S> other) const {
        return CharArray<N + other._string.size() - 1>(_string, other._string);
    }

    [[nodiscard]]
    constexpr auto c_str() const {
        return _string.data();
    }
};

template <std::size_t N, typename... Strings>
constexpr auto join_chars(const char (&first)[N], Strings&&... rest) {
    if constexpr (!sizeof...(Strings)) { return CharArray<N>{first}; }
    else {
        return CharArray<N>{first} + join_chars(std::forward<Strings>(rest)...);
    }
}

#include <iostream>

int main() {
    // this works:
    constexpr const char name[] = "Edward";
    constexpr auto joined1 = join_chars("name=", name);
    std::cout << joined1.c_str() << std::endl;

    // this does not work:
    // constexpr std::string_view value = "42"; // essentially same as constexpr const char*
    // constexpr auto joined2 = join_chars("value=", value);
    // std::cout << joined2.c_str() << std::endl;

    return 0;
}

However, this works only for string literals and char arrays. Is there a way to extend this to other compile-time string-like objects?

EDIT: As suggested by @Oersted, one way to achieve this is by adding these two static functions to the CharArray class:

template <std::size_t N>
class CharArray
{
  //...
  public:
    static constexpr CharArray create(const char* c_ptr) {
        std::array<char, N> tmp_array;
        std::copy(c_ptr, c_ptr + N, tmp_array.begin());
        CharArray<N> char_array{tmp_array};
        return char_array;
    }

    template <typename T>
    requires std::is_constructible_v<std::string_view, T>
    static constexpr CharArray create(const T& str) {
        return CharArray::create(str.data());
    }
};

And then adding a macro and an overload:

template <typename T>
requires std::is_constructible_v<std::string_view, T>
constexpr std::size_t constexpr_strlen(const T& c_ptr) {
    return std::string_view{c_ptr}.length();
}

#define ConstexprChars(str) CharArray<constexpr_strlen(str) + 1>::create(str)

template <std::size_t N, typename... Strings>
constexpr auto join_chars(const CharArray<N>& first, Strings&&... rest) {
    if constexpr (!sizeof...(Strings)) { return first; }
    else { return first + join_chars(std::forward<Strings>(rest)...); }
}

Then, one could do:

int main() {
    constexpr const char* value = "42";
    constexpr auto joined2 = join_chars("value=", ConstexprChars(value));
    std::cout << joined2.c_str() << std::endl;

    return 0;
}

However, this changes the API. Is it possible to achieve this while retaining the same API? That is, is it possible to have this:

int main() {
    constexpr const char* value = "42";
    constexpr auto joined2 = join_chars("value=", value);
    std::cout << joined2.c_str() << std::endl;

    return 0;
}

EDIT 2: I have found a video by Jason Turner dealing with a similar problem (though he also added static storage). He dealt with it by passing a lambda as a parameter to a constexpr function (which is apparently allowed). So instead of a class like my CharArray, he used that lambda trick. This does not make the API any better, since you have to wrap your char* in a lambda, so I guess that is the best one can get for now.
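
For reference, the lambda trick mentioned above can be sketched roughly as follows. This is a minimal illustration and not the code from the video: the name to_char_array and the variable forty_two are made up for this example. The idea is that the string lives in the lambda's type, so calling the (captureless) lambda inside the function is still a constant expression.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// Hypothetical helper (name chosen for this sketch only).
// The callable is taken by value; because it is a captureless lambda,
// calling it does not read any runtime state, so the result can
// initialize a constexpr variable inside the function.
template <typename MakeString>
constexpr auto to_char_array(MakeString make_string) {
    constexpr std::string_view text = make_string();
    std::array<char, text.size() + 1> out{};  // zero-filled, so the terminator is in place
    for (std::size_t i = 0; i < text.size(); ++i) {
        out[i] = text[i];
    }
    return out;
}

// The caller has to wrap the string in a lambda, which is the API cost.
constexpr auto forty_two = to_char_array([] { return std::string_view{"42"}; });
```

The array size is derived from the string returned by the lambda, which is exactly what a plain char const* parameter cannot provide.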


Solution

  • As written by @NathanOliver in a comment under your question, what you ask is just not possible.¹

    The example of desired code,

    int main() {
        constexpr const char* value = "42";
        constexpr auto joined2 = cconcat("value=", value);
        std::cout << joined2.c_str() << std::endl;
    
        return 0;
    }
    

    is highly misleading, because it circumvents the whole problem by putting the callee, hence the arguments passed to the caller, and the caller in the same translation unit, thus making the compiler aware of things that it would otherwise not know.

    Indeed, the only reason why the instantiation of cconcat succeeds, is that value is constexpr (and "value=" is a string literal), so Ns... can be all deduced by the compiler.

    A simpler example is this

    #include <cstddef>
    template <std::size_t N>
    constexpr auto foo(char const (&s)[N]) {
        return N;
    }
    int fun() {
        constexpr const char name[] = "Edward";
        static_assert(foo(name) == 7);
        return foo(name);
    }
    

    which compiles down to

    fun():
            mov     eax, 7
            ret
    

    But as soon as you make the input string come from another translation unit, like in this case,

    #include <cstddef>
    template <std::size_t N>
    constexpr auto foo(char const (&s)[N]) {
        return N;
    }
    char const* getstring(); // defined in other TU
    int fun() {
        foo(getstring()); // the foo above is a non-viable candidate
        return 0;
    }
    

    then you can't even compile, irrespective of whether the callee returns a compile time string (or even a string literal) via that char const*.

    Clarification 1

    I'm not saying that the direct cause of the code above not compiling is that getstring is defined in another TU.

    The direct cause, as you point out in a comment, is clearly that getstring returns char const*, a by-pointer C-style string of unknown length, whereas foo accepts char const(&)[N], a reference to a C-style string whose length is required to be known at compile time (indeed N is determined by template type deduction).

    But getstring is returning char const* precisely because it's defined in a TU that is meant to be linked against. The only way for another TU to return a C-style string, in a way that you can link it to a TU that calls foo(getstring());, is to have getstring return a C-style string by reference, which implies that it returns a string of known size! This, for instance, compiles

    #include <cstddef>
    template <std::size_t N>
    constexpr auto foo(const char (&)[N]) {
        return N;
    }
    
    char const (&getstring())[5]; // defined in another TU
    
    int fun() {
        foo(getstring());
        return 0;
    }
    

    but it is of little to no interest, imho, because it means that getstring can return only strings of a known length, 5 in the example.

    Clarification 2

    Since you seem to be positive about what I called the non-interesting case of caller and callee in the same TU, let me clarify what I meant by quoting myself:

    it circumvents the whole problem by putting the callee, hence the arguments passed to the caller, and the caller in the same translation unit

    Here I simplified a bit, by ascribing to "caller" both the call site and the "producer" of the strings that the caller passes to the callee. After all, you wrote these two lines together:

        constexpr const char* value = "42";
        constexpr auto joined2 = cconcat("value=", value);
    

    The only way for cconcat to concatenate strings at compile time is for those strings to be known at compile time! How would you possibly concatenate at compile time strings that will only be known at run time?

    This, one way or another, means that you have all the strings in the same TU where cconcat is called and defined. They surely don't come from a call to a char const*-returning function defined who knows where.

    Furthermore, you can't let the constexpr-ness be lost across function boundaries. And remember, what David Stone presented is not a thing at the moment, so it doesn't matter how much you and I know that char const* s = "hello"; is written in the same place where cconcat is defined and called like cconcat("a literal", s): the length of s is not known at compile time inside cconcat. End of story.

    So you're back to your original solution, and that's it!

    Clarification 3

    As much as constexpr char [const]* is constexpr, the length of that string is not part of its type. E.g.

    constexpr char const* s1{"hello"};
    constexpr char const* s2{"hello world"};
    static_assert(std::is_same_v<decltype(s1), decltype(s2)>);
    

    compiles.

    If a property of a value is not encoded in its type, the constexpr-ness of that value will be lost when that value is passed to a function as an argument. So, even within a single TU, the callee cannot treat the length as a compile-time constant:

    consteval auto f(char const* s) {
      // the size of s is not known here!
    }
    int main() {
      constexpr char const* s{"hello"};
      f(s);
    }
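
To make the point above a bit more concrete, here is a small sketch (the names length_of, greeting, Buffer and make_buffer are hypothetical, introduced only for this illustration). Computing the length of a constexpr char const* works when the whole call is evaluated as a constant expression, but the function parameter itself is never a constant expression, so it cannot be used where the language demands one, such as a template argument.

```cpp
#include <cstddef>

// Counts characters up to the terminator; usable at compile time
// when the argument is itself a constant expression at the call site.
constexpr std::size_t length_of(char const* s) {
    std::size_t n = 0;
    while (s[n] != '\0') { ++n; }
    return n;
}

constexpr char const* greeting = "hello";

// Fine: 'greeting' is a constant expression here, at the call site.
static_assert(length_of(greeting) == 5);

template <std::size_t N>
struct Buffer { char data[N]; };

// Does not compile if uncommented: inside the function, 's' is an
// ordinary parameter, so length_of(s) is not a constant expression
// and cannot size the Buffer.
//
// constexpr auto make_buffer(char const* s) {
//     return Buffer<length_of(s) + 1>{};
// }
```

So the value can be inspected at compile time, but it cannot influence any type inside the callee.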
    

    Furthermore, given s1 and s2 have the same type as shown above, you should understand you're out of luck even if you try to pass the C-style string as a NTTP; as in, if you define

    template<char const* s>
    consteval auto foo() {
        int c{};
        while (s[c] != '\0') { c++; }
        return c;
    };
    

    the following will not compile!

    int main() {
      constexpr char const* s{"hello"};
      foo<s>();
    }
    

    Notice that this compiles³:

    constexpr char const s[]{"hello"};
    int main() {
      foo<s>();
    }
    

    but the type of s is char const[6], so the size is part of the type! Indeed

    constexpr char const s1[]{"hello"};
    constexpr char const s2[]{"hello world"};
    static_assert(std::is_same_v<decltype(s1), decltype(s2)>);
    

    does not compile!
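
Conversely, when the size is part of the type, the lengths can be deduced and used to size the result. A minimal sketch, assuming a hypothetical free function concat (not the cconcat from the question) and made-up variable names:

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// N1 and N2 are deduced from the array types, terminator included,
// so the result size is a compile-time constant inside the function.
template <std::size_t N1, std::size_t N2>
constexpr auto concat(char const (&a)[N1], char const (&b)[N2]) {
    std::array<char, N1 + N2 - 1> out{};
    for (std::size_t i = 0; i + 1 < N1; ++i) {  // copy 'a' without its terminator
        out[i] = a[i];
    }
    for (std::size_t j = 0; j < N2; ++j) {      // copy 'b' with its terminator
        out[N1 - 1 + j] = b[j];
    }
    return out;
}

constexpr char const hello[]{"hello"};
constexpr char const world[]{" world"};
constexpr auto joined = concat(hello, world);
```

This is the same mechanism the question's CharArray relies on; it only works because the arguments are arrays, not pointers.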


    (¹) If you're truly asking for a solution to a problem where caller and callee are in the same TU, then the discussion is moot, imho, and the solution lies, in the worst case scenario, in a bit of template metaprogramming you can find for instance here. And whatever trick you'd need, you'd need it because C++ still has a long way to go; watch this talk by David Stone.

    But as far as solving the problem in the real-world case of caller and callee in different TUs goes, there's no solution to your use case, at least not one made possible by the compiler², because by the time caller and callee actually come into contact, the compiler (actually compilers, because two different ones could have been used for caller and callee) has long since finished its job.

    (²) Assuming only 2 TUs, you can imagine a very smart linker that would inspect the caller code's object file to work out what the lengths of all the strings passed to cconcat are, and then... basically modify the object code of the callee... which would essentially mean recompiling it, I believe...

    (³) Incidentally, I'm not sure why, for the above to compile, s is required to be static (so you need to write static if you move its definition inside function scope, i.e. main in this case).
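
A likely explanation for footnote (³), offered as a sketch rather than a definitive reading of the standard: a template argument of pointer type has to be a constant expression, and the address of an object is only a constant expression when the object has static storage duration. An automatic variable gets a different address on each call, so its address cannot be fixed at compile time. The names below (length_of, at_namespace_scope, local_static_demo) are made up for this example, and the function-local case assumes C++17 or later.

```cpp
// Same shape as the foo<s>() template above, renamed for this sketch.
template <char const* s>
constexpr int length_of() {
    int n = 0;
    while (s[n] != '\0') { ++n; }
    return n;
}

// Namespace scope: static storage duration, so its address is a constant.
constexpr char const at_namespace_scope[]{"hello"};

int local_static_demo() {
    // 'static' gives the array one fixed address for the whole program.
    static constexpr char const local[]{"hi"};

    // Without 'static' the array would be an automatic variable and
    // the line below would be rejected:
    // constexpr char const automatic[]{"hi"};
    // return length_of<automatic>();

    return length_of<local>();
}
```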