I have a Django project that uses a SQLite database, and an external tool can also write to that database. The text is supposed to be UTF-8, but some of it contains encoding errors. The text comes from an external source, so I cannot control its encoding. Yes, I know I could write a "wrapping layer" between the external source and the database, but I would prefer not to, especially since the database already contains a lot of "bad" data.
With the plain sqlite3 module, the solution is to change the connection's text_factory to something like:
lambda x: unicode(x, "utf-8", "ignore")
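For illustration, here is a minimal sketch of that text_factory approach using only the standard sqlite3 module, with no Django involved. It is written for Python 3, where text_factory receives a bytes object, so bytes.decode stands in for the unicode() call above. The table name, column name, sample bytes, and the lenient_utf8 helper are made up for this example.

```python
import sqlite3


def lenient_utf8(raw):
    # Hypothetical helper: decode as UTF-8 and silently drop invalid bytes.
    return raw.decode("utf-8", "ignore")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (body TEXT)")
# Store TEXT containing an invalid UTF-8 byte (0xFF) in the middle of "hello".
conn.execute("INSERT INTO messages VALUES (CAST(X'68656CFF6C6F' AS TEXT))")

# With the default text_factory, reading the row fails with a decode error.
try:
    conn.execute("SELECT body FROM messages").fetchall()
    strict_failed = False
except sqlite3.Error:
    strict_failed = True

# With the lenient factory, the bad byte is dropped and the read succeeds.
conn.text_factory = lenient_utf8
rows = conn.execute("SELECT body FROM messages").fetchall()
print(strict_failed, rows)
```

Passing "replace" instead of "ignore" would keep a visible marker where bytes were dropped, which may be preferable if silent data loss is a concern.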
However, I don't know how to tell the Django model driver this.
The exception I get is:
'Could not decode to UTF-8 column 'Text' with text'
in
/var/lib/python-support/python2.5/django/db/backends/sqlite3/base.py in execute
Somehow I need to tell the SQLite driver not to decode the text as UTF-8 with its standard strict algorithm, and to use my fail-safe variant instead.
Have you tried
from django.db import connection
connection.connection.text_factory = lambda x: unicode(x, "utf-8", "ignore")
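before running any queries? Note that connection.connection may still be None until Django has actually opened the database, so you may need to call connection.cursor() once first.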