c++ string templates string-literals user-defined-literals

How to avoid duplication of string literal text in functions that accept different width characters / strings?


I frequently need to create string manipulation functions in C++. My APIs tend to be written to accept std::basic_string<T> (effectively), but I also want to accept std::basic_string_view<T> when it makes sense. There have been a few pains in adopting string_views which I've worked around, but with literals I have found only pain.

Now, being a good developer, I abhor both inline literals and code duplication.

I've found myself trying to write this:

template <typename CharType>
constexpr const CharType* HELLO_WORLD = "Hello World!";

This doesn't work, specifically since if CharType is not char I need a u or U or L in there.
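To make the mismatch concrete: each prefix produces an array of a different character type, and a literal of one character type does not convert to a pointer to another. A minimal check, assuming C++17 for the `_v` traits (the literal `"Hi"` is only an illustration):

```cpp
#include <type_traits>

// A string literal is an lvalue, so decltype yields a reference to a const array.
static_assert(std::is_same_v<decltype("Hi"), const char (&)[3]>);
static_assert(std::is_same_v<decltype(L"Hi"), const wchar_t (&)[3]>);
static_assert(std::is_same_v<decltype(u"Hi"), const char16_t (&)[3]>);
static_assert(std::is_same_v<decltype(U"Hi"), const char32_t (&)[3]>);

// An unprefixed literal decays to const char*, but not to const wchar_t*,
// which is why the variable template above only compiles for CharType = char.
static_assert(std::is_convertible_v<decltype("Hi"), const char*>);
static_assert(!std::is_convertible_v<decltype("Hi"), const wchar_t*>);
```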

I would love a solution that looks like this:

template <typename CharType>
static const CharType* HELLO_WORLD = <CharType>"Hello World!";

My current alternative requires a macro that defines a struct template and specializes it for each character type:

#define DEFINE_STRING_LITERAL(name, str)                 \
    template <typename CharType>                         \
    struct name {};                                      \
                                                         \
    template <>                                          \
    struct name<char> {                                  \
        static constexpr const char* value = str;        \
    };                                                   \
                                                         \
    template <>                                          \
    struct name<wchar_t> {                               \
        static constexpr const wchar_t* value = L##str;  \
    };                                                   \
                                                         \
    template <>                                          \
    struct name<char16_t> {                              \
        static constexpr const char16_t* value = u##str; \
    };                                                   \
                                                         \
    template <>                                          \
    struct name<char32_t> {                              \
        static constexpr const char32_t* value = U##str; \
    };

Which gets defined like this:

DEFINE_STRING_LITERAL(HELLO_WORLD, "value");

and used like this:

HELLO_WORLD<CharType>::value

It's okay, and I can push the macro into a header somewhere and forget about it. But I don't like the definition syntax. It doesn't look like a definition, and then trying to find where it's defined with static tools if I need to change it is a headache. Also, it doesn't get type-deduced, which makes resulting code difficult to read.
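For reference, here is a self-contained sketch of that approach used from a generic function. The macro is trimmed to two character types for brevity, and `starts_with_greeting` is a hypothetical function that exists only to show the call site. It assumes C++17 for the `constexpr` `std::basic_string_view` operations:

```cpp
#include <string_view>

// Same shape as the macro above, trimmed to two character types for brevity.
#define DEFINE_STRING_LITERAL(name, str)                \
    template <typename CharType>                        \
    struct name {};                                     \
                                                        \
    template <>                                         \
    struct name<char> {                                 \
        static constexpr const char* value = str;       \
    };                                                  \
                                                        \
    template <>                                         \
    struct name<wchar_t> {                              \
        static constexpr const wchar_t* value = L##str; \
    };

DEFINE_STRING_LITERAL(HELLO_WORLD, "Hello World!")

// Hypothetical generic function: the character type has to be spelled out
// explicitly where the constant is used.
template <typename CharType>
constexpr bool starts_with_greeting(std::basic_string_view<CharType> text) {
    const std::basic_string_view<CharType> greeting = HELLO_WORLD<CharType>::value;
    return text.substr(0, greeting.size()) == greeting;
}

static_assert(starts_with_greeting(std::string_view{"Hello World! And more."}));
static_assert(starts_with_greeting(std::wstring_view{L"Hello World!"}));
```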

The compiler knows which literal prefixes are associated with which types. I know this because it won't compile if I mess it up. So, is there any way to avoid that struct specialization and get something closer to the one-liner I'd love to see?

If anyone has a better solution that reads more like standard C++, instead of using a macro to define a type on the fly, that would be awesome.

P.S. I could probably create a constant class that has its own literal syntax and stores a char*, but then I need to copy the strings instead of letting the compiler manage it. I'd also love it if these were truly static literals, since I can't use dynamic const CharType* as values in template parameters.


Solution

  • First, define a compile-time string class template that initializes an array of each character type:

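    #include <algorithm>    // std::ranges::copy
    #include <cstddef>      // std::size_t
    #include <string_view>  // std::basic_string_view
    #include <tuple>        // std::tuple, std::apply, std::get

    template <std::size_t N>
    class ct {
      // One array per character type, each holding a copy of the literal.
      std::tuple<char[N], wchar_t[N], char8_t[N], char16_t[N], char32_t[N]> arrays;
    
     public:
      consteval ct(const char (&literal)[N]) {
        // Copy the narrow literal, element by element, into every array.
        std::apply(
            [&](auto &...array) { (..., std::ranges::copy(literal, array)); },
            arrays);
      }
    
      ...
    };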
    

    Next, define a method that returns a view of a given character type:

      template <class Char>
      consteval std::basic_string_view<Char> view() const {
        return {std::get<Char[N]>(arrays), N - 1};
      }
    

    To use it, just define a constant:

    using namespace std::string_view_literals;  // for the sv suffix

    inline constexpr ct test = "Jon Skeet";
    
    static_assert(test.view<char>() == "Jon Skeet"sv);
    static_assert(test.view<wchar_t>() == L"Jon Skeet"sv);
    static_assert(test.view<char8_t>() == u8"Jon Skeet"sv);
    static_assert(test.view<char16_t>() == u"Jon Skeet"sv);
    static_assert(test.view<char32_t>() == U"Jon Skeet"sv);
    

    Try it on Compiler Explorer