python imdbpy

Can't pass database URL to imdbpy2sql.py script


Whether I run imdbpy2sql.py or s32cinemagoer.py, the same error occurs:

python3 s32cinemagoer.py /home/username/frozendata 'mysql://imdb:imdb@localhost/imdb'

Traceback (most recent call last):
  File "s32cinemagoer.py", line 197, in <module>
    engine = sqlalchemy.create_engine(db_uri, encoding='utf-8', echo=False)

  File "<string>", line 2, in create_engine
  File "/home/username/.local/lib/python3.8/site-packages/sqlalchemy/util/deprecations.py", line 281, in warned
    return fn(*args, **kwargs)  # type: ignore[no-any-return]
  File "/home/username/.local/lib/python3.8/site-packages/sqlalchemy/engine/create.py", line 680, in create_engine
    raise TypeError(
TypeError: Invalid argument(s) 'encoding' sent to create_engine(), using configuration MySQLDialect_mysqldb/QueuePool/Engine.  Please check that the keyword arguments are appropriate for this combination of component

I've checked the SQLAlchemy documentation on the create_engine() function, but I still don't understand what is wrong with the arguments I am passing in. According to the documentation, create_engine() takes several arguments, including "encoding", e.g. engine = create_engine("mysql://scott:tiger@hostname/dbname", encoding='latin1', echo=True).

Here is what the SQLAlchemy documentation says about the legacy utf-8 encoding in MySQL:

The encoding used for Unicode has traditionally been 'utf8'. However, for MySQL versions 5.5.3 and MariaDB 5.5 on forward, a new MySQL-specific encoding 'utf8mb4' has been introduced, and as of MySQL 8.0 a warning is emitted by the server if plain utf8 is specified within any server-side directives, replaced with utf8mb3. The rationale for this new encoding is due to the fact that MySQL’s legacy utf-8 encoding only supports codepoints up to three bytes instead of four. Therefore, when communicating with a MySQL or MariaDB database that includes codepoints more than three bytes in size, this new charset is preferred, if supported by both the database as well as the client DBAPI.

Here is the setConnection() function from the cinemagoer package, which probably passes the wrong encoding:

def setConnection(uri, tables, encoding='utf8', debug=False):
    """Set connection for every table."""
    params = {'encoding': encoding}
    # FIXME: why on earth MySQL requires an additional parameter,
    #        is well beyond my understanding...
    if uri.startswith('mysql'):
        if '?' in uri:
            uri += '&'
        else:
            uri += '?'
        uri += 'charset=%s' % encoding

I am now trying to run the script with SQLAlchemy 1.3, as suggested here, and the PyMySQL adapter (otherwise I get a No module named 'MySQLdb' error), but it doesn't seem to work either; the same issue occurs.


Solution

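  • Can you try this? It is just an example that passes the encoding directly in the database URI instead of as an encoding keyword argument to create_engine(). The encoding argument was deprecated in SQLAlchemy 1.4 and removed in 2.0, which would explain the TypeError above.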

    def setConnection(uri, tables, encoding='utf8', debug=False):
        """Set connection for every table."""
        # FIXME: why on earth MySQL requires an additional parameter,
        #        is well beyond my understanding...
        if uri.startswith('mysql'):
            if '?' in uri:
                uri += '&'
            else:
                uri += '?'
            uri += 'charset=%s' % encoding
    
        # No encoding= keyword here: the charset is already part of the URI.
        engine = sqlalchemy.create_engine(uri, echo=debug)
    
        ......
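
For reference, here is the same idea as a self-contained sketch. The helper names with_charset and make_engine are hypothetical and not part of cinemagoer; the sketch assumes the only change needed is to drop the encoding keyword and carry the charset in the URI. sqlalchemy is imported lazily inside make_engine so the URI helper can be tried without a database driver installed.

```python
def with_charset(uri, charset="utf8"):
    """Append a charset query parameter to MySQL URIs that lack one.

    Hypothetical helper mirroring the string handling in setConnection().
    """
    if uri.startswith("mysql") and "charset=" not in uri:
        separator = "&" if "?" in uri else "?"
        uri = f"{uri}{separator}charset={charset}"
    return uri


def make_engine(uri, charset="utf8", debug=False):
    """Create an engine without the removed encoding= keyword (sketch)."""
    import sqlalchemy  # lazy import: only needed when an engine is created

    # The charset travels in the URI, so no encoding argument is passed.
    return sqlalchemy.create_engine(with_charset(uri, charset), echo=debug)


if __name__ == "__main__":
    print(with_charset("mysql://imdb:imdb@localhost/imdb"))
```

If only PyMySQL is installed, a URI starting with mysql+pymysql:// selects that driver in SQLAlchemy; it still starts with 'mysql', so the helper treats it the same way.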