python unicode utf-8 pypy rpython

RPython ord() with non-ascii character


I'm writing a virtual machine in RPython using PyPy. My problem is with converting each character to its numeric representation. For example, converting the letter "a" gives 97, which I then convert to hex to get 0x61.

When I try to convert the letter "á" to its hexadecimal representation, I expect 0xe1, but instead I get 0xc3 0xa1.

Is there a specific encoding I need to use? Currently I'm using UTF-8.

--UPDATE--

Where instr is "á" (including the quotes):

for char in instr:
    code = ord(char)     # instr is a byte string, so each char is one UTF-8 byte
    print hex(code)[2:]  # Prints 22 c3 a1 22 (one value per line); 22 is each of the quotes
    # The desired output is 22 e1 22
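The two extra values come from iterating over UTF-8 bytes rather than characters: "á" is the single code point U+00E1, but UTF-8 encodes it as the two bytes C3 A1. The following is a minimal sketch in plain Python (not verified against the RPython toolchain) that contrasts the two views. The helper names hex_per_byte and hex_per_code_point are made up for illustration and are not part of any library.

```python
# -*- coding: utf-8 -*-
# Sketch only: helper names are hypothetical, chosen for this example.

text = u'"\u00e1"'                 # the string "á", including the quotes
utf8_bytes = text.encode('utf-8')  # what a UTF-8 source file or input holds

def hex_per_byte(data):
    # bytearray yields integers on both Python 2 and Python 3
    return ['%02X' % b for b in bytearray(data)]

def hex_per_code_point(data, encoding='utf-8'):
    # Decode first, so that ord() sees whole characters, not single bytes
    return ['%02X' % ord(ch) for ch in data.decode(encoding)]

print(hex_per_byte(utf8_bytes))        # ['22', 'C3', 'A1', '22']
print(hex_per_code_point(utf8_bytes))  # ['22', 'E1', '22']
```

Decoding before calling ord() gives the desired 22 E1 22 without changing the encoding of the source file.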

Solution

  • #!/usr/bin/env python
    # -*- coding: latin-1 -*-
    
    # The file itself must be saved as Latin-1 so that 'á' is a single byte.
    char = 'á'
    
    print str(int(ord(char)))
    print hex(ord(char))
    print hex(ord(char.decode('latin-1')))
    

    Gives me:

    225
    0xe1
    0xe1
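One caveat with the Latin-1 approach is that it only covers code points up to U+00FF, where the Latin-1 byte value and the Unicode code point coincide. The sketch below illustrates that limit in plain Python; latin1_hex is a hypothetical helper name, and the euro sign is just an example of a character outside Latin-1.

```python
# Sketch only: latin1_hex is a hypothetical helper, not a library function.

def latin1_hex(text):
    """Return one hex string per character of a unicode string.

    Raises UnicodeEncodeError for characters above U+00FF, which have
    no single-byte Latin-1 representation.
    """
    encoded = text.encode('latin-1')
    return [hex(b) for b in bytearray(encoded)]

print(latin1_hex(u'\u00e1'))  # ['0xe1']

try:
    latin1_hex(u'\u20ac')     # the euro sign is not in Latin-1
except UnicodeEncodeError:
    print('not representable in Latin-1')
```

If the virtual machine needs to handle characters beyond that range, decoding the input as UTF-8 and working with code points is likely the more general option.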