c++ templates c++20 template-specialization

How to select a specialized function template at runtime

Consider this scenario (godbolt): I have a text buffer and a function that tells me its encoding at runtime:

enum class Enc {UTF8, UTF16LE, UTF16BE, UTF32LE, UTF32BE};
Enc detect_encoding_of(std::string_view buf);

Then I have a series of functions that extract the codepoints from the text buffer according to its encoding. I have organized them by specializing a function template on the enum above:

template<Enc enc> char32_t extract_next_codepoint(const std::string_view buf, std::size_t& pos);
template<> char32_t extract_next_codepoint<Enc::UTF8>(const std::string_view buf, std::size_t& pos);
template<> char32_t extract_next_codepoint<Enc::UTF16LE>(const std::string_view buf, std::size_t& pos);

In order to parse the text buffer I have to select the proper function depending on the detected encoding:

const std::string_view buf; // filled at runtime
const Enc buf_enc = detect_encoding_of(buf);
std::size_t pos = 0;
switch( buf_enc )
   {
    case Enc::UTF8:
        // parse using extract_next_codepoint<Enc::UTF8>(buf,pos)
        break;

    case Enc::UTF16LE:
        // parse using extract_next_codepoint<Enc::UTF16LE>(buf,pos)
        break;

    // ...
    }

The extract_next_codepoint() functions are called many times, which is why I'm avoiding runtime polymorphism here. The downside of my current solution is that I have to write and maintain a lot of repeated, almost identical code for each of the supported encodings. Is there a way to write less and let the compiler help a little?

Solution

You can use one of the following solutions:

Assign the selected specialization to a function pointer, then call it through that pointer. https://gcc.godbolt.org/z/Pffz97E4x

char32_t (*next)(const std::string_view buf, std::size_t& pos) = nullptr; // set by the switch below

switch (buf_enc)
{
    case Enc::UTF8:
        next = extract_next_codepoint<Enc::UTF8>;
        break;

    case Enc::UTF16LE:
        next = extract_next_codepoint<Enc::UTF16LE>;
        break;

   // ...
}

while (pos<buf.size())
{
    const char32_t codepoint = next(buf, pos);
    fmt::print("{} at {} got {}\n", buf, pos, (int)codepoint);
}
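
As a self-contained illustration of the function-pointer approach, here is a minimal sketch. The two decoders are deliberately simplified stand-ins (the UTF-8 one only handles ASCII, the UTF-16LE one ignores surrogate pairs), and the names extractor_t, select_extractor and decode_all are hypothetical helpers introduced only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

enum class Enc {UTF8, UTF16LE};

template<Enc enc> char32_t extract_next_codepoint(std::string_view buf, std::size_t& pos);

// Simplified stand-in: treats every byte as one codepoint (correct for ASCII only).
template<> char32_t extract_next_codepoint<Enc::UTF8>(std::string_view buf, std::size_t& pos)
{
    assert(pos < buf.size());
    return static_cast<unsigned char>(buf[pos++]);
}

// Simplified stand-in: reads one little-endian 16-bit unit, ignoring surrogate pairs.
template<> char32_t extract_next_codepoint<Enc::UTF16LE>(std::string_view buf, std::size_t& pos)
{
    assert(pos + 1 < buf.size());
    const auto lo = static_cast<unsigned char>(buf[pos]);
    const auto hi = static_cast<unsigned char>(buf[pos + 1]);
    pos += 2;
    return static_cast<char32_t>(lo | (hi << 8));
}

// Hypothetical alias for the common signature of all specializations.
using extractor_t = char32_t (*)(std::string_view, std::size_t&);

// Hypothetical helper: the switch runs once and yields the matching specialization.
extractor_t select_extractor(Enc enc)
{
    switch (enc)
    {
        case Enc::UTF8:    return extract_next_codepoint<Enc::UTF8>;
        case Enc::UTF16LE: return extract_next_codepoint<Enc::UTF16LE>;
    }
    return nullptr; // not reached for valid enumerators
}

// Hypothetical helper: the parsing loop is written once and calls through the pointer.
std::vector<char32_t> decode_all(std::string_view buf, Enc enc)
{
    std::vector<char32_t> out;
    const extractor_t next = select_extractor(enc);
    if (next == nullptr) return out;
    for (std::size_t pos = 0; pos < buf.size(); )
        out.push_back(next(buf, pos));
    return out;
}
```

Returning the pointer from a small function keeps the switch in one place, and the nullptr check guards against an enumerator that the switch does not cover.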

Move your parsing logic into a separate function template that calls the corresponding specialization. https://gcc.godbolt.org/z/nzGnoPnnW

template <Enc enc> void handle(const std::string_view &buf)
{
    for (std::size_t pos = 0; pos < buf.size(); )
    {
        const char32_t codepoint = extract_next_codepoint<enc>(buf, pos);
        fmt::print("{} at {} got {}\n", buf, pos, (int)codepoint);
    }
}

...

switch (buf_enc)
{
    case Enc::UTF8:
        handle<Enc::UTF8>(buf);
        break;

    case Enc::UTF16LE:
        handle<Enc::UTF16LE>(buf);
        break;

   // ...
}
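
A compilable sketch of this second approach might look like the following. It assumes simplified stand-in decoders (ASCII-only for UTF-8, no surrogate pairs for UTF-16LE), and it passes each codepoint to a caller-supplied callable instead of printing it. The Sink parameter and the parse function are hypothetical additions for this example, not part of the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

enum class Enc {UTF8, UTF16LE};

template<Enc enc> char32_t extract_next_codepoint(std::string_view buf, std::size_t& pos);

// Simplified stand-in: one byte per codepoint (correct for ASCII only).
template<> char32_t extract_next_codepoint<Enc::UTF8>(std::string_view buf, std::size_t& pos)
{
    assert(pos < buf.size());
    return static_cast<unsigned char>(buf[pos++]);
}

// Simplified stand-in: one little-endian 16-bit unit, surrogate pairs ignored.
template<> char32_t extract_next_codepoint<Enc::UTF16LE>(std::string_view buf, std::size_t& pos)
{
    assert(pos + 1 < buf.size());
    const auto lo = static_cast<unsigned char>(buf[pos]);
    const auto hi = static_cast<unsigned char>(buf[pos + 1]);
    pos += 2;
    return static_cast<char32_t>(lo | (hi << 8));
}

// The per-codepoint logic is written once; the extractor is named at compile time,
// so the call inside the loop is a direct call to the chosen specialization.
template<Enc enc, typename Sink>
void handle(std::string_view buf, Sink&& sink)
{
    for (std::size_t pos = 0; pos < buf.size(); )
        sink(extract_next_codepoint<enc>(buf, pos));
}

// Hypothetical entry point: the only place that switches on the runtime value.
template<typename Sink>
void parse(std::string_view buf, Enc enc, Sink&& sink)
{
    switch (enc)
    {
        case Enc::UTF8:    handle<Enc::UTF8>(buf, sink);    break;
        case Enc::UTF16LE: handle<Enc::UTF16LE>(buf, sink); break;
    }
}
```

With this layout, supporting another encoding means adding one specialization and one case label; the loop body itself is never duplicated.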

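I expect the second solution to be faster, since each handle<enc> calls its extract_next_codepoint<enc> directly and the compiler can inline it, while a call through a function pointer is usually harder to inline. I haven't measured it, though, so they may well perform the same.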