I am trying to read a csv into an sqlite database with Qt SQL (pyside6) but the method I have tried is far too slow.
The sqlite3 module in python has an executemany() function which can insert ~1 million rows very quickly. Using a list as a replacement for csv data:
import sqlite3
# Connect to database
con = sqlite3.connect('./db0.db')
cur = con.cursor()
# Create table
cur.execute('''CREATE TABLE foo (
id INTEGER PRIMARY KEY AUTOINCREMENT UNIQUE NOT NULL,
col1 TEXT(30),
col2 TEXT(30))''')
# Insert data
data = list(("foo", "bar") for i in range(0, 1000000))
cur.executemany("INSERT INTO foo (col1, col2) VALUES (?, ?)", data)
# Close connection
con.commit()
con.close()
In pyside6, the execBatch() method (doc) seems to do something similar but is extremely slow, ~1+ hr (I didn't wait for it to finish). QVariants are mentioned in the documentation but have since been removed. Am I missing something or is there a better method of inserting larger amounts of data?
from PySide6.QtSql import (
QSqlDatabase,
QSqlQuery
)
# Connect to database
con = QSqlDatabase.addDatabase("QSQLITE")
con.setDatabaseName('./db0.db')
con.open()
# Create table
query = QSqlQuery()
query.exec('''CREATE TABLE bar (
id INTEGER PRIMARY KEY AUTOINCREMENT UNIQUE NOT NULL,
col1 TEXT(30),
col2 TEXT(30)
)''')
# Insert data
c1 = list("foo" for i in range(0, 100000))
c2 = list("bar" for i in range(0, 100000))
query.prepare("INSERT INTO bar (col1, col2) VALUES (?, ?)")
query.addBindValue(c1)
query.addBindValue(c2)
query.execBatch()
# Close connection
query.finish()
con.close()
I would prefer to use QT's SQL support over the sqlite3 module for model views.
As suggested in the comments, performance can be improved by starting a transaction, then using a loop and the exec()
method, as opposed to execBatch()
. The insertion finished in a couple of seconds on my machine.
con.transaction()
query = QSqlQuery()
query.prepare("INSERT INTO bar (col1, col2) VALUES (?, ?)")
# Insert data
c1 = list("foo" for i in range(0, 1000000))
c2 = list("bar" for i in range(0, 1000000))
query.prepare("INSERT INTO bar (col1, col2) VALUES (?, ?)")
for a, b in zip(c1, c2):
query.bindValue(0, a)
query.bindValue(1, b)
query.exec()
con.commit()