sql-serverpython-3.xpyodbcexecutemany

pyodbc fast_executemany on Linux garbles strings on insert


I have the following code, which works fine on Windows, and fails on Linux:

import pyodbc

conn = pyodbc.connect(p_str = None, server = ..., app = ..., databsae = ...,
                      driver = '{ODBC Driver 17 for SQL Server}',
                      Trusted_Connection = 'yes')

conn.setdecoding(pyodbc.SQL_CHAR, encoding = 'utf-8')
conn.setdecoding(pyodbc.SQL_WCHAR, encoding = 'utf-8')
conn.setencoding(encoding = 'utf-8')

sql = 'INSERT INTO TestStrTbl(idKey,idValue) VALUES (?,?)'
data = [('one', 'value1'), ('two', 'value2')]

cursor = conn.cursor()
cursor.connection.autocommit = False
cursor.fast_executemany = True
cursor.executemany(sql, data)
cursor.commit()

I am inserting into and empty SQL Server table, created with the following SQL:

CREATE TABLE TestStrTbl
(
    idKey   varchar(20) NOT NULL PRIMARY KEY,
    idValue varchar(20) NOT NULL
)

The error returned is:

Violation of PRIMARY KEY constraint 'PK__TestStrT__3FBEE7404FA9AB3B'. Cannot insert duplicate key in object 'dbo.TestStrTbl'. The duplicate key value is (?).

When I use the same connection to read, or insert using complete strings, like

sql = "INSERT INTO TestStrTbl(idKey,idValue) VALUES ('%s','%s')"
cursor.connection.autocommit = False
cursor.fast_executemany = True
for row in data:
    cursor.execute(sql % row)
cursor.commit()

this works just fine. Another possibility is to let cursor.fast_executemany = False, then it will work on Linux as well. What could I do to fix this?

I am running Python 3.7.8 with pyodbc==4.0.24 on Ubuntu Linux. Thanks for your help.


UPDATE

Here are the unixODBC settings:

18:48:56 $> odbcinst -j
unixODBC 2.3.1
DRIVERS............: /usr/local/etc/odbcinst.ini
SYSTEM DATA SOURCES: /usr/local/etc/odbc.ini
FILE DATA SOURCES..: /usr/local/etc/ODBCDataSources
USER DATA SOURCES..: /home/myUserId/.odbc.ini
SQLULEN Size.......: 8
SQLLEN Size........: 8
SQLSETPOSIROW Size.: 8

Solution

  • There is no need to call conn.setdecoding and conn.setencoding when working with Microsoft's ODBC drivers for SQL Server. The default encoding for pyodbc is UTF16-LE, which is what Microsoft's ODBC drivers use.

    As for unixODBC, anything prior to version 2.3.5 (2018-01-02) should be considered "old" as that was a major bugfix release. Version 2.3.7 (2018-08-10) fixed several more bugs and is currently the recommended version.