According to the documentation, g_print() is supposed to receive a UTF-8 string and print it. Instead it prints garbage, while plain printf() prints the text correctly.
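/* Build with the flags from: pkg-config --cflags --libs glib-2.0 */
#include <glib.h>
#include <stdio.h>

int main(void) {
    /* UTF-8 encoded literal (the source file is saved as UTF-8) */
    guchar val[] = "яблоко";

    g_print("Fruit is %s\n", val);
    printf("Fruit is %s\n", val);
    /* sizeof yields a size_t, so the matching conversion is %zu, not %ld */
    printf("sizeof(val)=%zu\n", sizeof(val));
    /* Dump the bytes in hex to check the encoding */
    for (int idx = 0; val[idx]; idx++) {
        printf("%02x ", val[idx]);
    }
    printf("\n");
    return 0;
}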
The output is:
Fruit is ??????
Fruit is яблоко
sizeof(val)=13
d1 8f d0 b1 d0 bb d0 be d0 ba d0 be
The last two lines show that the val array really does contain a UTF-8 string: 12 bytes of text (two bytes per Cyrillic letter) plus the terminating NUL.
Why does g_print() print garbage? What am I missing?
I found a solution. I am not sure it is the correct one, but it works for me right now.
Apparently, g_print() converts UTF-8 characters into question marks if it believes that the console is not UTF-8 compatible: https://docs.gtk.org/glib/warnings.html#encoding
And yes, calling setlocale() did solve my problem. Still, I am not entirely sure why.
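Adding this code to the previous sample (it also needs #include <locale.h> for setlocale()):
    const char *charset;

    /* Charset GLib assumes for the console before the locale is set */
    g_get_console_charset(&charset);
    g_print("charset = %s\n", charset);

    /* Adopt the locale from the environment (LANG, LC_*) */
    setlocale(LC_ALL, "");

    /* Ask again now that the locale is set */
    g_get_console_charset(&charset);
    g_print("charset = %s\n", charset);
produced this additional output: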
charset = ANSI_X3.4-1968
charset = UTF-8
So at startup my application believes that the console is using the ANSI_X3.4-1968 code page. I have no idea where this came from. But setlocale(LC_ALL, "") did reset the console expectation to UTF-8. After that, g_print() starts working properly (and printf() continues to work properly).