c++ initializer-list user-defined-literals

Using a C++ user-defined literal to initialise an array


I have a bunch of test vectors, presented in the form of hexadecimal strings:

MSG: 6BC1BEE22E409F96E93D7E117393172A
MAC: 070A16B46B4D4144F79BDD9DD04A287C
MSG: 6BC1BEE22E409F96E93D7E117393172AAE2D8A57
MAC: 7D85449EA6EA19C823A7BF78837DFADE

etc. I need to get these into a C++ program somehow, without too much editing required. There are various options:

But the one I ended up using was:

because fun. I defined a helper class HexByteArray and a user-defined literal operator HexByteArray operator "" _$ (const char* s) that parses a string of the form "0xXX...XX", where XX...XX is an even number of hex digits. HexByteArray includes conversion operators to const uint8_t* and std::vector<uint8_t>. So now I can write e.g.

struct {
  std::vector<uint8_t> MSG ;
  const uint8_t* MAC ;
  } Test1 = {
  0x6BC1BEE22E409F96E93D7E117393172A_$,
  0x070A16B46B4D4144F79BDD9DD04A287C_$
  } ;

Which works nicely. But now here is my question: Can I do this for arrays as well? For instance:

uint8_t MAC[16] = 0x070A16B46B4D4144F79BDD9DD04A287C_$ ;

or even

uint8_t MAC[] = 0x070A16B46B4D4144F79BDD9DD04A287C_$ ;

I can't see how to make this work. To initialise an array, I would seem to need an std::initializer_list. But as far as I can tell, only the compiler can instantiate such a thing. Any ideas?


Here is my code:

HexByteArray.h

#include <cstdint>
#include <vector>

class HexByteArray
  {
public:
  HexByteArray (const char* s) ;
  ~HexByteArray() { delete[] a ; }

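  // Hands the buffer to the caller, who becomes responsible for delete[]
  operator const uint8_t*() && { const uint8_t* t = a ; a = 0 ; return t ; }
  // Copies the bytes; the destructor still frees the original buffer
  operator std::vector<uint8_t>() &&
    {
    return std::vector<uint8_t> ( a, a + len ) ;
    }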

  class ErrorInvalidPrefix { } ;
  class ErrorHexDigit { } ;
  class ErrorOddLength { } ;

private:
  const uint8_t* a = 0 ;
  size_t len ;
  } ;

inline HexByteArray operator "" _$ (const char* s)
  {
  return HexByteArray (s) ;
  }

HexByteArray.cpp

#include "HexByteArray.h"

#include <cctype>
#include <cstdio>   // sscanf
#include <cstring>

HexByteArray::HexByteArray (const char* s)
  {
  if (s[0] != '0' || toupper (s[1]) != 'X') throw ErrorInvalidPrefix() ;
  s += 2 ;

  // Special case: 0x0_$ is an empty array (because 0x_$ is invalid C++ syntax)
  if (!strcmp (s, "0"))
    {
    a = nullptr ; len = 0 ;
    }
  else
    {
    for (len = 0 ; s[len] ; len++) if (!isxdigit (s[len])) throw ErrorHexDigit() ;
    if (len & 1) throw ErrorOddLength() ;
    len /= 2 ;
    uint8_t* t = new uint8_t[len] ;
    for (size_t i = 0 ; i < len ; i++, s += 2)
      sscanf (s, "%2hhx", &t[i]) ;
    a = t ;
    }
  }
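
The constructor above has to check for and skip the "0x" prefix because a raw literal operator receives the literal exactly as it is spelled in the source, prefix included. A minimal sketch of that behaviour, using a hypothetical _len suffix that is not part of the code above and assuming C++14 or later:

```cpp
#include <cstddef>

// Hypothetical suffix `_len`, for illustration only: a raw literal operator
// is handed the numeric literal as a NUL-terminated string of its spelling.
constexpr std::size_t operator""_len(const char* s)
{
    std::size_t n = 0;
    while (s[n] != '\0') ++n;
    return n;
}

// "0x6BC1" is six characters: the "0x" prefix plus four hex digits.
static_assert(0x6BC1_len == 6, "prefix is included in the spelling");
// "0x0" arrives as three characters, so it can be singled out as a special case.
static_assert(0x0_len == 3, "0x0 is passed as the string \"0x0\"");
```

Because the operator only sees characters, the literal may be far longer than any integer type could hold.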

Solution

  • Use a numeric literal operator template, with the signature:

    template <char...>
    result_type operator "" _x();
    

    Also, since the data is known at compile-time, we might as well make everything constexpr. Note that we use std::array instead of C-style arrays:

    #include <cstdint>
    #include <array>
    #include <vector>
    
    // Constexpr hex parsing algorithm follows:
    struct InvalidHexDigit {};
    struct InvalidPrefix {};
    struct OddLength {};
    
    constexpr std::uint8_t hex_value(char c)
    {
        if ('0' <= c && c <= '9') return c - '0';
        // This assumes ASCII:
        if ('A' <= c && c <= 'F') return c - 'A' + 10;
        if ('a' <= c && c <= 'f') return c - 'a' + 10;
        // In constexpr-land, this is a compile-time error if execution reaches it:
        // The weird `if (c == c)` is to work around gcc 8.2 erroring out here even though
        // execution doesn't reach it.
        if (c == c) throw InvalidHexDigit{};
    }
    
    constexpr std::uint8_t parse_single(char a, char b)
    {
        return (hex_value(a) << 4) | hex_value(b);
    }
    
    template <typename Iter, typename Out>
    constexpr auto parse_hex(Iter begin, Iter end, Out out)
    {
        if (end - begin <= 2) throw InvalidPrefix{};
        if (begin[0] != '0' || (begin[1] != 'x' && begin[1] != 'X')) throw InvalidPrefix{};
        if ((end - begin) % 2 != 0) throw OddLength{};
    
        begin += 2;
    
        while (begin != end)
        {
            *out = parse_single(*begin, *(begin + 1));
            begin += 2;
            ++out;
        }
    
        return out;
    }
    
    // Make this a template to defer evaluation until later        
    template <char... cs>
    struct HexByteArray {
        static constexpr auto to_array()
        {
            constexpr std::array<char, sizeof...(cs)> data{cs...};
    
            std::array<std::uint8_t, (sizeof...(cs) / 2 - 1)> result{};
    
            parse_hex(data.begin(), data.end(), result.begin());
    
            return result;
        }
    
        constexpr operator std::array<std::uint8_t, (sizeof...(cs) / 2 - 1)>() const 
        {
            return to_array();
        }
    
        operator std::vector<std::uint8_t>() const
        {
            constexpr auto tmp = to_array();
    
            return std::vector<std::uint8_t>{tmp.begin(), tmp.end()};
        }
    };
    
    template <char... cs>
    constexpr auto operator"" _$()
    {
        static_assert(sizeof...(cs) % 2 == 0, "Must be an even number of chars");
        return HexByteArray<cs...>{};
    }
    

    Example usage:

    auto data_array = 0x6BC1BEE22E409F96E93D7E117393172A_$ .to_array();
    std::vector<std::uint8_t> data_vector = 0x6BC1BEE22E409F96E93D7E117393172A_$;
    

    As a side note, $ in an identifier is a compiler extension (gcc accepts it), not standard C++, so the code is not portable as written. Consider using a UDL suffix other than _$.
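
To tie this back to the test vectors in the question, here is a minimal sketch, assuming C++17 and a hypothetical standard-conforming suffix _hex in place of _$. It is a condensed variant rather than the code above: the operator returns the std::array directly instead of a wrapper type, and the names hex_digit, TestVector, test1 and copy_into are made up for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Convert one hex digit (assumes ASCII). Reaching the throw during constant
// evaluation turns an invalid digit into a compile-time error.
constexpr std::uint8_t hex_digit(char c)
{
    return ('0' <= c && c <= '9') ? static_cast<std::uint8_t>(c - '0')
         : ('A' <= c && c <= 'F') ? static_cast<std::uint8_t>(c - 'A' + 10)
         : ('a' <= c && c <= 'f') ? static_cast<std::uint8_t>(c - 'a' + 10)
         : throw "invalid hex digit";
}

// Hypothetical suffix `_hex`: the literal's characters arrive as the pack cs.
template <char... cs>
constexpr auto operator""_hex()
{
    static_assert(sizeof...(cs) >= 4 && sizeof...(cs) % 2 == 0,
                  "expected 0x followed by an even, non-zero number of hex digits");
    constexpr char text[] = {cs...};
    static_assert(text[0] == '0' && (text[1] == 'x' || text[1] == 'X'),
                  "missing 0x prefix");

    // Two characters per byte, minus the two prefix characters.
    std::array<std::uint8_t, sizeof...(cs) / 2 - 1> out{};
    for (std::size_t i = 0; i < out.size(); ++i)
        out[i] = static_cast<std::uint8_t>(
            (hex_digit(text[2 + 2 * i]) << 4) | hex_digit(text[3 + 2 * i]));
    return out;
}

// The array length is part of the type, so a vector of the wrong length
// fails to compile instead of being silently truncated.
struct TestVector {
    std::array<std::uint8_t, 16> msg;
    std::array<std::uint8_t, 16> mac;
};

constexpr TestVector test1 = {
    0x6BC1BEE22E409F96E93D7E117393172A_hex,
    0x070A16B46B4D4144F79BDD9DD04A287C_hex,
};

static_assert(test1.msg[0] == 0x6B && test1.msg[15] == 0x2A, "MSG bytes");

// A built-in array still cannot be initialised from an object,
// but it can be filled from one of matching length.
template <std::size_t N>
constexpr void copy_into(std::uint8_t (&dst)[N],
                         const std::array<std::uint8_t, N>& src)
{
    for (std::size_t i = 0; i < N; ++i) dst[i] = src[i];
}
```

With this in place, uint8_t MAC[16]; copy_into(MAC, test1.mac); fills a C-style array, and test1.mac.data() yields a const uint8_t* for APIs that expect a pointer. When calling a member directly on a literal, wrap it in parentheses or leave a space before the dot, as in the to_array() example above; otherwise the dot is lexed as part of the number.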