python decode cp1251

Python: decode Russian string


I received a list of tuples from a MySQL database.
When I try to print an item, here is the result:

Далоев ÐлекÑандр
<class 'str'>

This is cp1251, according to https://2cyr.com/decode/?lang=ru

I have tried lots of variations of .encode().decode() with the errors='ignore' parameter, but without any success. Any ideas?

UPD: I receive my list of tuples with mysql-connector-python.

z is the list. The result above is from z[0][0].

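from mysql.connector import MySQLConnection, Error

def select_name(add):
    # read_db_config() is defined elsewhere (not shown here)
    z = []
    conn = None
    cursor = None
    try:
        dbconfig = read_db_config()
        conn = MySQLConnection(**dbconfig)
        cursor = conn.cursor()
        # pass the value as a query parameter instead of concatenating it into the SQL
        cursor.execute("select name from phone_add where ph_add = %s", (add,))

        row = cursor.fetchone()
        while row is not None:
            z.append(row)
            row = cursor.fetchone()
        return z

    except Error as e:
        print(e)

    finally:
        # cursor and conn are still None if connecting failed
        if cursor is not None:
            cursor.close()
        if conn is not None:
            conn.close()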

UPD2: Here is a weird decoder. Hope it helps somebody.

I realised that the problem is in how the data is inserted into my DB. I will dig there.

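q = string  # placeholder for the garbled str, e.g. z[0][0]

codings = ['latin1', 'utf8', 'cp1251', 'unicode-escape', 'cp866']
exceptions = ['ignore', 'strict', 'xmlcharrefreplace', 'backslashreplace']
# try every encode/decode pair with every combination of error handlers
for i in codings:
    for j in codings:
        for z in exceptions:
            for p in exceptions:
                try:
                    print(q.encode(i, errors=z).decode(j, errors=p) + '<------' + i + ' ' + j + ' ' + z + ' ' + p)
                except Exception:
                    # skip combinations that cannot be encoded or decoded
                    pass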
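
The loop above prints every combination, so the readable result is easy to miss. Below is a rough sketch of a narrower search that uses strict error handling only and keeps just the candidates made up entirely of Russian letters and spaces. The names looks_russian and find_russian_decodings are made up for this example, cp1252 is added to the list as an assumption (it is not in the list above), and the sample string is built by hand rather than taken from the database.

```python
import itertools

# Encodings to try. cp1252 is an assumption added for this sketch.
CODINGS = ['latin1', 'utf8', 'cp1251', 'cp1252', 'cp866']


def looks_russian(text):
    """Crude heuristic: only Russian letters and spaces, at least one letter."""
    stripped = text.replace(' ', '')
    return bool(stripped) and all(
        '\u0410' <= ch <= '\u044f' or ch in '\u0401\u0451'  # А..я plus Ё, ё
        for ch in stripped
    )


def find_russian_decodings(garbled, codings=CODINGS):
    """Try each encode/decode pair strictly and keep readable results."""
    hits = []
    for enc, dec in itertools.permutations(codings, 2):
        try:
            candidate = garbled.encode(enc).decode(dec)
        except UnicodeError:
            continue  # this pair cannot represent the text
        if looks_russian(candidate):
            hits.append((enc, dec, candidate))
    return hits


# Hand-made garbled sample: UTF-8 bytes read as cp1252.
sample = 'Привет'.encode('utf8').decode('cp1252')
for enc, dec, text in find_russian_decodings(sample):
    print(text + ' <------ ' + enc + ' ' + dec)
```

The filter is deliberately strict, so names with digits or punctuation would need a looser check. It also only helps when no bytes were lost along the way.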

Solution

  • The problem was in the database. The string was already damaged during insertion. I called mysql_set_charset('utf8'); in my insertion script and everything went all right.
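
To illustrate why no .encode().decode() combination could repair the stored value, here is a small sketch of one plausible failure mode. It assumes the UTF-8 bytes were read as cp1252 somewhere on the way in and that bytes with no cp1252 character were dropped. That is a guess about the mechanism, not something confirmed for this database. The name is written by hand for the example.

```python
name = 'Александр'
raw = name.encode('utf-8')

# Assumed failure mode: UTF-8 bytes read as cp1252 on the way in.
# Bytes 0x90 and 0x81 have no cp1252 character, so a lenient decode drops them.
garbled = raw.decode('cp1252', errors='ignore')
print(garbled)

# Reversing the steps no longer restores the name: the dropped bytes are gone,
# so the UTF-8 decoder has to substitute replacement characters.
recovered = garbled.encode('cp1252').decode('utf-8', errors='replace')
print(recovered)
print(recovered == name)
```

Under that assumption the only real fix is the one above: make the inserting side use the right connection charset and insert the data again.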