I am using Boost Spirit X3 to create a programming language, but when I try to support Unicode, I get an error!
Here is an example of a simplified version of that program.
#define BOOST_SPIRIT_X3_UNICODE
#include <boost/spirit/home/x3.hpp>
namespace x3 = boost::spirit::x3;
struct sample : x3::symbols<unsigned> {
sample()
{
add("48", 10);
}
};
int main()
{
const std::string s("🌸");
boost::u8_to_u32_iterator<std::string::const_iterator> first{cbegin(s)},
last{cend(s)};
x3::parse(first, last, sample{});
}
What should I do?
As you noticed, internally char_encoding::unicode
employs char32_t
.
So, first changing the symbols
accordingly:
template <typename T>
using symbols = x3::symbols_parser<boost::spirit::char_encoding::unicode, T>;
struct sample : symbols<unsigned> {
sample() { add(U"48", 10); }
};
Now the code fails calling into case_compare
:
/home/sehe/custom/boost_1_78_0/boost/spirit/home/x3/string/detail/tst.hpp|74 col 33| error: no match for call to ‘(boost::spirit::x3::case_compare<boost::spirit::char_encoding::unicode>) (reference, char32_t&)’
As you can see it expects a char32_t
reference, but u8_to_u32_iterator
returns unsigned int
s (std::uint32_t
).
Just for comparison / sanity check: https://godbolt.org/z/1zozxq96W
Luckily you can instruct the u8_to_u32_iterator
to use another co-domain type:
#define BOOST_SPIRIT_X3_UNICODE
#include <boost/spirit/home/x3.hpp>
#include <iomanip>
#include <iostream>
namespace x3 = boost::spirit::x3;
template <typename T>
using symbols = x3::symbols_parser<boost::spirit::char_encoding::unicode, T>;
struct sample : symbols<unsigned> {
sample() { add(U"48", 10)(U"🌸", 11); }
};
int main() {
auto test = [](auto const& s) {
boost::u8_to_u32_iterator<decltype(cbegin(s)), char32_t> first{
cbegin(s)},
last{cend(s)};
unsigned parsed_value;
if (x3::parse(first, last, sample{}, parsed_value)) {
std::cout << s << " -> " << parsed_value << "\n";
} else {
std::cout << s << " FAIL\n";
}
};
for (std::string s : {"🌸", "48", "🤷"})
test(s);
}
Prints
🌸 -> 11
48 -> 10
🤷 FAIL