pythonmysqlpython-2.7utf-8pymssql

Python prints wrong encoding from MySQL database


I'm having issues with my Python when calling data from my MySQL database.

The database is set as UTF-8 and is containing, special letters such as 'Æ' and 'Ø'.

This is the code i use to call the Usernames from my Table

# -*- coding: UTF-8 -*-
import pymysql
db = pymysql.connect(
    host="localhost",
    user="root",
    password="XXXXXXXX",
    db="pythonconnectiontest", 
    charset="utf8"
)
cursor = db.cursor()
cursor.execute("SELECT Username FROM Account")

numrows = cursor.rowcount 
for i in range (numrows):
    row = cursor.fetchone()
    print row

The expected output is:

ThisTextIsBrøken
Tæst 
Word

Actual output:

(u'ThisTextIsBr\xf8ken',)
(u'T\xe6st',)
(u'Word',)

I do get that the 'U' in front of the value, indicates that this is UNICODE and i probably just need to make python interpret the text correctly with encode() or decode().

I've spent many hours trying to figure this out, and the solution is probably very simple.

I hope someone can help me fix this.

Thanks for reading.


Solution

  • The unicode strings it's outputting are perfectly fine. You can verify this by trying print u'ThisTextIsBr\xf8ken' in a python shell:

    ➜ python2
    Python 2.7.15 (default, Jan 10 2019, 23:20:52)
    [GCC 8.2.1 20181127] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> print u'ThisTextIsBr\xf8ken'
    ThisTextIsBrøken
    

    Your confusion is merely about how python represents unicode strings that are wrapped in another object. In this case your rows each represent a tuple of columns or more specifically a tuple containing a single unicode string.

    You can verify this by adjusting your code to print the actual unicode string rather than the tuple containing it.

    # -*- coding: UTF-8 -*-
    import pymysql
    db = pymysql.connect(
        host="localhost",
        user="root",
        password="XXXXXXXX",
        db="pythonconnectiontest", 
        charset="utf8"
    )
    cursor = db.cursor()
    cursor.execute("SELECT Username FROM Account")
    
    numrows = cursor.rowcount 
    for i in range (numrows):
        row = cursor.fetchone()
        print row[0]
    

    This should now output the expected strings.