I am trying to implement a C++ Translate function for localization.
// Language package, containing key-value pairs of translation, e.g.,
// g_LanguagePack["HELLO@1"] = "Hello, {}!"
// The "@N" suffix indicates that this format string has N parameters.
// When there is no parameter, the suffix can be omitted.
std::unordered_map<std::string, std::string> g_LanguagePack;
// The translator function
template <typename... VA>
std::string Translate(const std::string& key, VA&&... params);
When invoked, e.g., as Translate("HELLO@1", "FOO"), it will do a lookup in the language package and return the localized string "Hello, FOO!".
key is guaranteed to be a compile-time string (so key's type may need to be changed), and in practice developers may provide a mismatching number of parameters, or omit the @N suffix while providing parameters. So I think it is necessary to add a check mechanism to ensure N == sizeof...(VA).
At the beginning, I used static_assert in Checker, and it failed because the static assertion expression is not an integral constant expression. Then I learned from "User-defined literal string: compile-time length check" that I can directly use assert in consteval functions, and it works.
However, GPT said it's not recommended to call assert in consteval functions (I am not sure about it, since it compiles on both Clang and MSVC). And, if it is not recommended, what could be a better implementation?
#include <cassert>
#include <cstddef>
#include <string_view>

template <std::size_t N, typename... VA>
consteval void Checker(const char (&key)[N], VA&&... va)
{
    std::string_view string_view(key, N - 1); // drop the trailing '\0'
    std::size_t param_cnt = 0;
    auto indicator_index = string_view.find('@');
    if (indicator_index == std::string_view::npos) // no params
    {
        assert(sizeof...(VA) == 0);
    }
    else
    {
        // parse param_cnt from the digits after '@'
        for (auto i = indicator_index + 1; string_view.begin() + i != string_view.end(); i++)
        {
            auto digit = string_view.at(i); // get digit
            assert('0' <= digit && digit <= '9');
            param_cnt = param_cnt * 10 + digit - '0';
        }
        assert(sizeof...(va) == param_cnt);
    }
}
int main(int argc, char* argv[])
{
    Checker("foo@2", 1, 2);
    Checker("foo@1", "string");
    return 0;
}
Then comes the tricky part. I tried to use Checker in Translate; unfortunately, it did not work, because key, when passed as a parameter, is no longer guaranteed to be a compile-time constant.
template <std::size_t N, typename... VA>
std::string Translate(const char (&key)[N], VA&&... va)
{
    Checker(key, va...); // Function parameter 'key' with unknown value cannot be used in a constant expression
    // do translation
    return "";
}
This is a very similar problem to what fmt::format and now std::format want to do: type-check the format string:
std::format("x={}"); // compile-time error (missing argument)
std::format("x={}", 1); // ok
std::format("x={:d}", "not a number"); // compile-time error (bad specifier)
The mechanism by which this works is pretty clever. You think of the signature of format as being:
template <typename... Args>
auto format(string_view, Args&&...) -> string;
But it's really this:
template <typename... Args>
auto format(format_string<Args...>, Args&&...) -> string;
where:
template <typename... Args>
struct basic_format_string {
    template <class S> requires std::convertible_to<S, std::string_view>
    consteval basic_format_string(S s) {
        std::string_view sv = s;
        // now parse the thing
    }
};
template <typename... Args>
using format_string = basic_format_string<std::type_identity_t<Args>...>;
That is: when you call format("x={}"), that is going to try to initialize basic_format_string<> from "x={}". That constructor is consteval. It's in that constructor that the format string is parsed. If that parsing fails, you just do some non-constant-expression operation, and that will cause the whole expression to fail to compile.
So you just have to do the exact same thing:
#include <charconv>
#include <concepts>
#include <cstddef>
#include <string>
#include <string_view>
#include <system_error>
#include <type_traits>

template <typename... Args>
struct basic_format_string {
    std::string_view sv;
    template <class S> requires std::convertible_to<S, std::string_view>
    consteval basic_format_string(S s) : sv(s) {
        auto idx = sv.find('@');
        if (idx == sv.npos) {
            if (sizeof...(Args) != 0) {
                throw "expected no arguments";
            }
        } else {
            // Note: std::from_chars is constexpr for integer types only since C++23.
            std::size_t v = 0;
            auto [p, ec] = std::from_chars(sv.data() + idx + 1, sv.data() + sv.size(), v);
            if (ec == std::errc() and p == sv.data() + sv.size()) {
                if (sizeof...(Args) != v) {
                    throw "wrong number of arguments";
                }
            } else {
                throw "invalid arg";
            }
        }
    }
};
template <typename... Args>
using format_string = basic_format_string<std::type_identity_t<Args>...>;
template <typename... VA>
std::string Translate(format_string<VA...> fmt, VA&&... va)
{
    // use fmt.sv and va...
    return "something";
}
Which works as expected:
int main() {
    Translate("foo@2", 1, 2); // ok
    Translate("foo@3", 1, 2); // compile-time error (wrong number of arguments)
}