python  python-2.x  urdu

Accessing characters of Urdu script


I have the following string:

test="ن گ ب ن د ی ک ر و ا ن "

I want to access each character and save it in a variable for later use, but when I loop over the string I get weird output. I am not very familiar with encoding schemes.

for i in test:
    print(i)

The code above prints weird characters. How do I get the original script characters instead?


Solution

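  • In Python 2, a plain string literal is a byte string, so the loop walks over the individual bytes of the UTF-8 encoding rather than over the characters. Either use the decode method, or define test as a unicode string: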

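    # -*- coding: utf-8 -*-
    # (assumes this source file is saved as UTF-8)

    # Option 1: decode the UTF-8 byte string into a unicode object first.
    test = "ن گ ب ن د ی ک ر و ا ن"
    for i in test.decode('utf8'):
        print(i)
        # print the escaped code point, e.g. u'\u0646'
        print(repr(i))

    # Option 2: declare the literal as unicode with the u prefix.
    test = u"ن گ ب ن د ی ک ر و ا ن"
    for i in test:
        print(i)
        # print the escaped code point
        print(repr(i))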
    

    This answer applies to Python 2.7.x. In Python 3, str is already Unicode, so the original loop works as written.
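
Since the question also asks about saving each character for later use, here is a minimal sketch of one way to do that. It assumes the source file is saved as UTF-8 and that the spaces in the string are only separators; the names text, raw and chars are illustrative and not from the original post. It should run on both Python 2.7 and Python 3, and it prints only ASCII (code point and Unicode name) so the output does not depend on the terminal encoding.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function
import unicodedata

# Unicode text (the u prefix is required on Python 2, optional on Python 3).
text = u"ن گ ب ن د ی ک ر و ا ن"

# The same text as UTF-8 bytes: this is what a plain "..." literal
# holds in Python 2 when the source file is UTF-8.
raw = text.encode("utf-8")

# Save every non-space character in a list for later access.
chars = [ch for ch in text if not ch.isspace()]

# The byte string is longer than the text because each Urdu letter
# takes more than one byte in UTF-8.
print(len(text), "code points,", len(raw), "bytes")

for ch in chars:
    # Show the code point and its official Unicode name.
    print("U+%04X" % ord(ch), unicodedata.name(ch, "UNKNOWN"))
```

Individual letters are then available by index, for example chars[0], and "".join(chars) rebuilds the text without the spaces.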