[SOLVED] UTF-8 in psycopg3

UTF-8 in psycopg3

I'm refactoring a python code base that is using psycopg2 to psycopg3.

The code has calls to <cursor>.mogrify() which is not present in psycopg3.

The code has something like this ensure that the strings are in UTF-8:

arg_str = cur.mogrify("(%s,%s,%s,%s,%s)", x)
args_list.append(arg_str)
args_str = b','.join(args_list).decode("utf-8")
cur.execute(insert_query + args_str)

In the new codebase, I added client_encoding="UTF8" to the psycopg.connect() call

Do I still need to do something like this ensure that strings will be in UTF8 when passing to execute()?

cur.execute("INSERT INTO data(name) values (%s)", name.encode("utf-8").decode())

Solution

The right way to handle encoding in both psycopg2 and psycopg3 is to

set the encoding on the connection or via SET client_encoding TO ...
pass str values to cursor.execute and let the connector handle encoding (and decoding)

So the correct code using psycopg3 is

cur.execute("INSERT INTO data(name) values (%s)", (name,))

See the docs for string adaptation.

The psycopg2 code in the question seems to be unnecessarily complicated, something like this would have worked just as well (assuming Python3):

x = ('a', 'b', 'c', 'd', 'e')
placeholders = ','.join(['%s'] * len(x))
insert_query = """INSERT INTO tbl (c1, c2, c3, c4, c5) VALUES ({})""".format(placeholders)
cur.execute(insert_query, args)