I made the mistake of accidentally using non-ascii characters in a form that was submitted into a database using SQLAlchemy, running on Flask. Basically, rather than using the ASCII hyphen –
, I used the unicode en-dash –
. I am trying to now go back and replace all occurrences of the en-dash with a hyphen in my database.
Let's say I have a users table, and the column I'm trying to change is called occupation
. I'm able to figure out which entries in my database have the invalid character, because when I run:
User.query.get(id)
if the user has an invalid ASCII character, it returns
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 167: ordinal not in range(128)
So how can I go about replacing all occurrences of the en-dash with a hyphen in the occupation
column for all rows in my DB?
I was able to fix this by running a script over all entries in my database, and replacing the ones with faulty characters.
from user.models import *
for u in User.query.all():
# \u2013 is unicode for en-dash
if u"\u2013" in u.occupation:
# replace with normal hyphen
u.occupation = u.occupation.replace(u"\u2013", "-")
db.session.commit()