Tags: c++, templates, c++20, template-specialization

How to select a specialized function template at runtime


Consider this scenario (godbolt): I have a text buffer and a function that detects its encoding at runtime:

enum class Enc {UTF8, UTF16LE, UTF16BE, UTF32LE, UTF32BE};
Enc detect_encoding_of(std::string_view buf);

Then I have a series of functions that extract the codepoints from the text buffer according to its encoding. I have organized them by specializing a function template on the enum above:

template<Enc enc> char32_t extract_next_codepoint(const std::string_view buf, std::size_t& pos);
template<> char32_t extract_next_codepoint<Enc::UTF8>(const std::string_view buf, std::size_t& pos);
template<> char32_t extract_next_codepoint<Enc::UTF16LE>(const std::string_view buf, std::size_t& pos);

In order to parse the text buffer I have to select the proper function depending on the detected encoding:

const std::string_view buf; // filled at runtime
const Enc buf_enc = detect_encoding_of(buf);
std::size_t pos = 0;
switch( buf_enc )
   {
    case Enc::UTF8:
        // parse using extract_next_codepoint<Enc::UTF8>(buf,pos)
        break;

    case Enc::UTF16LE:
        // parse using extract_next_codepoint<Enc::UTF16LE>(buf,pos)
        break;

    // ...
    }

The extract_next_codepoint() functions are called many times, which is why I'm avoiding runtime polymorphism here. The downside of my current solution is that I have to write and maintain a lot of repeated, almost identical code for each supported encoding. Is there a way to write less and let the compiler do some of the work?


Solution

  • You can use one of the following solutions:

    1. Assign the selected specialization to a function pointer and then call it through that pointer. https://gcc.godbolt.org/z/Pffz97E4x

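      // Initialized to nullptr so an unhandled encoding is not left as an indeterminate pointer
      char32_t (*next)(const std::string_view buf, std::size_t& pos) = nullptr;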
      
      switch (buf_enc)
      {
          case Enc::UTF8:
              next = extract_next_codepoint<Enc::UTF8>;
              break;
      
          case Enc::UTF16LE:
              next = extract_next_codepoint<Enc::UTF16LE>;
              break;
      
         // ...
      }
      
      while (pos<buf.size())
      {
          const char32_t codepoint = next(buf, pos);
          fmt::print("{} at {} got {}\n", buf, pos, (int)codepoint);
      }
      
    2. Move your logic into a separate function template that calls the corresponding specialization. https://gcc.godbolt.org/z/nzGnoPnnW

      template <Enc enc> void handle(const std::string_view &buf)
      {
          for (std::size_t pos = 0; pos < buf.size(); )
          {
              const char32_t codepoint = extract_next_codepoint<enc>(buf, pos);
              fmt::print("{} at {} got {}\n", buf, pos, (int)codepoint);
          }  
      }
      
      ...
      
      switch (buf_enc)
      {
          case Enc::UTF8:
              handle<Enc::UTF8>(buf);
              break;
      
          case Enc::UTF16LE:
              handle<Enc::UTF16LE>(buf);
              break;
      
         // ...
      }
      

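    I expect the second solution to be faster, since the call to extract_next_codepoint<enc> is resolved at compile time and can be inlined, whereas a call through a function pointer is usually harder for the compiler to inline. I haven't measured it, though, and they may well perform the same.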
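
If the repeated switch itself is the main annoyance, one possible variation on the second solution is to write the switch only once, in a small dispatcher that turns the runtime Enc into a compile-time tag and passes it to a generic lambda. This is only a sketch under stated assumptions: with_encoding and decode are hypothetical names that do not appear in the question, and the two extract_next_codepoint bodies are toy stand-ins (no validation, no multi-byte UTF-8 sequences, no surrogate pairs) so that the example compiles on its own.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string_view>
#include <type_traits>
#include <utility>
#include <vector>

enum class Enc {UTF8, UTF16LE, UTF16BE, UTF32LE, UTF32BE};

template<Enc enc> char32_t extract_next_codepoint(std::string_view buf, std::size_t& pos);

// Toy stand-in: treats every byte as one codepoint (not a real UTF-8 decoder)
template<> char32_t extract_next_codepoint<Enc::UTF8>(std::string_view buf, std::size_t& pos)
{
    return static_cast<unsigned char>(buf[pos++]);
}

// Toy stand-in: reads one little-endian 16-bit unit (ignores surrogate pairs)
template<> char32_t extract_next_codepoint<Enc::UTF16LE>(std::string_view buf, std::size_t& pos)
{
    if (pos + 1 >= buf.size())   // truncated unit: consume the rest, emit U+FFFD
    {
        pos = buf.size();
        return U'\uFFFD';
    }
    const auto lo = static_cast<unsigned char>(buf[pos]);
    const auto hi = static_cast<unsigned char>(buf[pos + 1]);
    pos += 2;
    return static_cast<char32_t>(lo | (hi << 8));
}

// Hypothetical helper: the only place where the runtime value is switched on.
// It calls f with a std::integral_constant carrying the encoding as a type.
template<typename F>
auto with_encoding(Enc enc, F&& f)
{
    switch (enc)
    {
        case Enc::UTF8:
            return std::forward<F>(f)(std::integral_constant<Enc, Enc::UTF8>{});
        case Enc::UTF16LE:
            return std::forward<F>(f)(std::integral_constant<Enc, Enc::UTF16LE>{});
        default:
            // Only the two encodings specialized above are handled in this sketch
            throw std::invalid_argument("encoding not handled in this sketch");
    }
}

// Hypothetical caller: the parsing loop is written once for all encodings
std::vector<char32_t> decode(std::string_view buf, Enc buf_enc)
{
    return with_encoding(buf_enc, [&](auto tag) {
        constexpr Enc enc = decltype(tag)::value;   // compile-time constant again
        std::vector<char32_t> out;
        for (std::size_t pos = 0; pos < buf.size(); )
            out.push_back(extract_next_codepoint<enc>(buf, pos));
        return out;
    });
}
```

With this in place, a call such as decode(buf, detect_encoding_of(buf)) instantiates the lambda body once per handled encoding, so the inner loop still calls the specialization directly, as in the second solution. Supporting another encoding means adding one case to with_encoding plus its specialization. Whether this is actually faster than the function pointer would need to be measured.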