python unicode translation gettext pot

gettext: how to avoid failure at a Unicode character?


Here's a Python file that causes gettext to fail at the Unicode escape \u2191.

texts = {
    'first': _(u'Hello world'),
    'fails': _(u'Arrow: \u2191'),  # This code causes problems for gettext
    'omitted': _(u'Innocent string here')
}

Running C:\Python27\pythonw.exe C:\Python27\Tools\i18n\pygettext.py -d string_file string_file.py from the command line produces a POT file with the correct header, but the output stops at the Unicode arrow:

#: translate.py:2
msgid "Hello world"
msgstr ""

#: translate.py:3
msgid

How can I get pygettext to handle the Unicode escape?


Solution

  • A workaround is to keep the escape sequence out of the translatable strings and substitute the character in afterwards:

    # Not wrapped in _() so does not enter gettext
    arrow_char = u'\u2191'
    
    # These are now accessible to gettext
    texts = {
        'first': _(u'Hello world'),
        # Interpolate after _() so the msgid itself stays plain ASCII
        'fails': _(u'Arrow: %s') % arrow_char,  # No longer causes a problem
        'omitted': _(u'Innocent string here')
    }
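
To try the workaround without a compiled catalog, here is a minimal sketch. It assumes `gettext.NullTranslations` as a stand-in for a real translation (it returns each msgid unchanged). The names `ARROW_CHAR` and `is_ascii` are hypothetical helpers for illustration and are not part of gettext.

```python
import gettext

# Assumption: NullTranslations stands in for a real catalog; it returns
# the msgid unchanged, so no .mo file is needed to run this sketch.
_ = gettext.NullTranslations().gettext

# Hypothetical constant, kept outside _() so pygettext never sees it.
ARROW_CHAR = u'\u2191'


def is_ascii(text):
    """Return True if text can be encoded as plain ASCII."""
    try:
        text.encode('ascii')
    except UnicodeEncodeError:
        return False
    return True


# The literal passed to _() contains only ASCII; the arrow is
# interpolated after the (possibly translated) string comes back.
label = _(u'Arrow: %s') % ARROW_CHAR
```

A translator then only sees `Arrow: %s` in the POT file and must keep the `%s` placeholder in the translated string.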