My goal is, given a list of ids in Python, to find the ids that are not mapped to a row in an SQLite table. I'm trying to achieve this using the EXCEPT operator:
-- if the table currently stores id1 and id3, this returns only id2
WITH cte(id) AS (VALUES ('id1'), ('id2'), ('id3'))
SELECT * from cte EXCEPT SELECT id FROM some_table
I want to specify the ids dynamically from a list. I can build the query with string formatting, interpolating the values directly into the SQL:
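# build the VALUES rows first; nesting double quotes inside an f-string only works on Python 3.12+
values = ",".join(f"('{id}')" for id in ids)
query = (
"with cte(id) as " +
f"(values {values}) " +
"select * from cte except select id from some_table"
)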
print(query)
res = cursor.execute(query)
This is vulnerable to SQL injection; placeholder syntax is preferred instead. The Python sqlite3 documentation shows examples with executemany for INSERT operations, but how do I apply that to a single SELECT + EXCEPT query (which must use execute and not executemany)? Alternatively, is there a better way to filter a list of inputs down to those that aren't present in a table? A sample of my problem:
import sqlite3
db = sqlite3.connect(":memory:")
cursor = db.cursor()
#
# First create a table of video-id,video-title pairs
#
cursor.execute("CREATE TABLE IF NOT EXISTS videos(id TEXT PRIMARY KEY, title TEXT)")
dummy_data = [
("vid1", "Video 1"),
("vid2", "Video 2"),
("vid3", "Video 3"),
]
# use executemany to insert multiple rows via placeholder VALUES
cursor.executemany("INSERT INTO videos VALUES(?, ?)", dummy_data)
db.commit()
# sanity check that we see the expected videos
res = cursor.execute("SELECT * FROM videos")
print(f"select* result: {res.fetchall()}")
#
# Next, given a set of video ids, find all of the ids not already stored in the DB
#
new_video_ids = ["vid1", "vid2", "vid5"] # vid1 and vid2 already exist in db. only vid5 should be returned
new_video_ids_str = ",".join(f"('{id}')" for id in new_video_ids)
print(new_video_ids_str)
# The following query uses python string formatting and is therefore vulnerable to SQL injection attacks
query = (
"with cte(id) as " +
f"(values {new_video_ids_str}) " +
"select * from cte except select id from videos"
)
print(query)
res = cursor.execute(query)
print(f"filter result: {res.fetchall()}")
# I'd like to use sqlite3 placeholder values but can't figure out the syntax. The following doesn't work:
# it fails because it tries to bind all of the `new_video_ids` values to a single row rather than to multiple rows.
#
# query = (
# "with cte(id) as " +
# "(values (?)) " +
# "select * from cte except select id from videos"
# )
# res = cursor.execute(query, new_video_ids)
# print(f"filter result: {res.fetchall()}")
db.close()
Build one (?) placeholder per id, then pass the list itself as the parameters to execute:
new_video_ids = ["vid1", "vid2", "vid5"] # vid1 and vid2 already exist in db. only vid5 should be returned
new_video_ids_str = ",".join(
["(?)"] * len(new_video_ids)
)
print(new_video_ids_str)
# only the "(?)" placeholders are formatted into the SQL text; the ids themselves are bound by sqlite3
query = (
"with cte(id) as " +
f"(values {new_video_ids_str}) " +
"select * from cte except select id from videos"
)
print(query)
res = cursor.execute(query, new_video_ids)
print(f"filter result: {res.fetchall()}")
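For reference, here is a self-contained sketch of the placeholder approach above, plus one possible alternative that answers the executemany part of the question by loading the ids into a temporary table first. The function names (missing_ids_placeholders, missing_ids_temp_table) and the temp table name candidate_ids are made up for illustration; only the videos table comes from the question.

```python
import sqlite3


def missing_ids_placeholders(cursor, ids):
    """Return the ids that have no row in `videos`, binding one `(?)` per id."""
    if not ids:
        return []  # VALUES with zero rows is not valid SQL, so short-circuit
    placeholders = ",".join(["(?)"] * len(ids))
    query = (
        f"WITH cte(id) AS (VALUES {placeholders}) "
        "SELECT id FROM cte EXCEPT SELECT id FROM videos"
    )
    return [row[0] for row in cursor.execute(query, ids)]


def missing_ids_temp_table(cursor, ids):
    """Same result, but the ids are inserted into a temp table with executemany."""
    # `candidate_ids` is a hypothetical name chosen for this sketch
    cursor.execute("CREATE TEMP TABLE IF NOT EXISTS candidate_ids(id TEXT)")
    cursor.execute("DELETE FROM candidate_ids")  # clear rows left by earlier calls
    cursor.executemany(
        "INSERT INTO candidate_ids VALUES (?)", [(i,) for i in ids]
    )
    rows = cursor.execute(
        "SELECT id FROM candidate_ids EXCEPT SELECT id FROM videos"
    ).fetchall()
    return [row[0] for row in rows]


# Small demo against an in-memory database shaped like the one in the question
db = sqlite3.connect(":memory:")
cursor = db.cursor()
cursor.execute("CREATE TABLE videos(id TEXT PRIMARY KEY, title TEXT)")
cursor.executemany(
    "INSERT INTO videos VALUES(?, ?)",
    [("vid1", "Video 1"), ("vid2", "Video 2"), ("vid3", "Video 3")],
)
print(missing_ids_placeholders(cursor, ["vid1", "vid2", "vid5"]))
print(missing_ids_temp_table(cursor, ["vid1", "vid2", "vid5"]))
db.close()
```

The placeholder version keeps everything in one statement, but SQLite caps the number of bound parameters per statement (the limit depends on how the library was built), so for very long lists the temp table variant, or splitting the list into chunks, may be the safer choice. Note that EXCEPT also removes duplicates, and the order of the returned ids should not be relied on.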