I have two distinct files called:
'╠.txt' and '¦.txt'
This simple code:
import os

# raw string, no trailing backslash (a trailing \ would escape the closing quote)
files = os.listdir(r'E:\pub\private\desktop')
for f in files:
    print f, repr(f), type(f)
prints:
¦.txt '\xa6.txt' <type 'str'>
¦.txt '\xa6.txt' <type 'str'>
I don't understand why I get the code 0xA6 for the ╠ character instead of 0xCC. I have been playing around with encode and decode, but without success. I noticed that sys.getfilesystemencoding() returns mbcs, but I can't manage to change it to something like cp437.
Any help is very much appreciated. Thanks!
You have to pass a unicode string to os.listdir
and Python will return unicode filenames:
import os

# ur"..." is a unicode raw string (Python 2 only): backslashes are kept literally
path = ur"E:\pub\private\desktop"
print os.listdir(path)
# [u'\xa6.txt', u'\u2560.txt']
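Windows NT actually uses Unicode for filenames. As far as I can tell, when you pass a byte string (str) path, Python 2 gets the names back encoded in the ANSI code page (which is what mbcs refers to), and a character with no slot there, such as ╠, is replaced by a lookalike (here ¦, 0xA6). That would explain why both files show up under the same name.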
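To see where the two byte values in the question come from, you can encode the character by hand. This is only a sketch: it assumes the ANSI code page behind mbcs is cp1252 (that depends on the Windows locale), and the variable names are made up for illustration.

```python
from __future__ import print_function

# U+2560 is the box-drawing character from the first filename.
name = u'\u2560.txt'

# cp437 (the OEM/console code page) has a slot for it: 0xCC.
cp437_bytes = name.encode('cp437')
print(repr(cp437_bytes))

# cp1252 (assumed here to be the ANSI code page) has no slot for it,
# so a strict encode raises UnicodeEncodeError.
try:
    name.encode('cp1252')
    representable = True
except UnicodeEncodeError:
    representable = False
print(representable)

# Byte 0xA6 in cp1252 is U+00A6 (broken bar), the character that
# appears in the question's output for both files.
ansi_char = b'\xa6'.decode('cp1252')
print(repr(ansi_char))
```

So 0xCC is what you would get in cp437, but the byte-string listing never goes through cp437. Python's own 'replace' error handler would give ? rather than ¦, so the ¦ appears to come from the Windows conversion and not from Python's codec.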