python encoding windowserror

How to work around Python's "WindowsError messages are not properly encoded" problem?


When Python raises a WindowsError, the exception message is always a byte string in the operating system's native encoding rather than Unicode. For example:

import os
os.remove('does_not_exist.file')

This raises an exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
WindowsError: [Error 2] 系統找不到指定的檔案。: 'does_not_exist.file'

Because the language of my Windows 7 installation is Traditional Chinese, the default error message I get is in Big5 encoding (also known as CP950).

>>> try:
...     os.remove('abc.file')
... except WindowsError, value:
...     print value.args
...
(2, '\xa8t\xb2\xce\xa7\xe4\xa4\xa3\xa8\xec\xab\xfc\xa9w\xaa\xba\xc0\xc9\xae\xd7\xa1C')
>>>

As you can see, the error message is not Unicode, so I get another encoding exception when I try to print it. This is a known issue in the Python issue tracker: http://bugs.python.org/issue1754

The question is: how can I work around this? How do I find the native encoding of the WindowsError message? I am using Python 2.6.

Thanks.


Solution

  • We have the same problem in the Russian version of MS Windows: the code page of the default locale is cp1251, but the default code page of the Windows console is cp866:

    >>> import sys
    >>> print sys.stdout.encoding
    cp866
    >>> import locale
    >>> print locale.getdefaultlocale()
    ('ru_RU', 'cp1251')
    

    The solution should be to decode the Windows message with the default locale encoding:

    >>> import os
    >>> try:
    ...     os.remove('abc.file')
    ... except WindowsError, err:
    ...     print err.args[1].decode(locale.getdefaultlocale()[1])
    ...
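
The decoding step above could be wrapped in a small helper. This is only a sketch: `decode_os_message` is a hypothetical name, and it swaps in `locale.getpreferredencoding()` as the default instead of `locale.getdefaultlocale()[1]`, on the assumption that both report the same ANSI code page on the machine in question. The sample bytes are the ones printed in the question, so the encoding is passed explicitly as cp950 here.

```python
import locale


def decode_os_message(message, encoding=None):
    """Return an OS error message as Unicode text (hypothetical helper)."""
    # Already-decoded text (unicode on Python 2, str on Python 3) is returned as-is.
    if not isinstance(message, bytes):
        return message
    if encoding is None:
        # Assumption: the message is in the locale's preferred encoding.
        encoding = locale.getpreferredencoding()
    # 'replace' avoids raising a second exception while handling the first one.
    return message.decode(encoding, 'replace')


# Byte string taken from value.args[1] in the question (Big5 / cp950).
raw = b'\xa8t\xb2\xce\xa7\xe4\xa4\xa3\xa8\xec\xab\xfc\xa9w\xaa\xba\xc0\xc9\xae\xd7\xa1C'
text = decode_os_message(raw, 'cp950')
```

Inside an `except WindowsError` block the call would be `decode_os_message(err.args[1])`. Printing the result to a console that uses a different code page (cp866 versus cp1251 in the example above) may still need an explicit encode step.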
    

    The bad news is that you still can't use exc_info=True in logging.error().
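
One possible way around the logging limitation is to log the decoded message explicitly instead of passing `exc_info=True`. The sketch below assumes that losing the traceback in the log record is acceptable; `log_os_error` is a hypothetical helper, not part of the `logging` module, and the fallback to `locale.getpreferredencoding()` is an assumption about where the message bytes come from.

```python
import locale
import logging


def log_os_error(logger, err, encoding=None):
    """Log an OSError/WindowsError with its message decoded (hypothetical helper)."""
    text = err.strerror
    if isinstance(text, bytes):
        # Assumption: the OS message uses the locale's preferred encoding.
        text = text.decode(encoding or locale.getpreferredencoding(), 'replace')
    # Log a Unicode message built from the error number and the decoded text;
    # no exc_info here, so the raw byte message is never formatted.
    logger.error(u'[Error %s] %s', err.errno, text)
```

A caller would use it as `log_os_error(logging.getLogger(__name__), err)` inside the `except` block. Whether the log handler can write the resulting Unicode text still depends on the handler's own encoding.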