I'm currently writing an application in Python that stores its data in a SQLite database. I want the database file to be stored encrypted on disk, and I found the most common solution for doing this to be SQLCipher. I added sqlcipher3 to my project to provide the DB-API, and got started. With SQLCipher, the database encryption key is provided in the form of a PRAGMA statement which must be provided before the first operation on the database is executed.
PRAGMA key='hunter2'; -- like this
When my program runs, it prompts the user for the database password. My concern is that since this is a source of user input, it's potentially vulnerable to SQL injection. For example, a naive way to provide the key might look something like this:
from getpass import getpass
import sqlcipher3
con = sqlcipher3.connect(':memory:')
cur = con.cursor()
password = getpass('Password: ')
cur.execute(f"PRAGMA key='{password}';")
### do stuff with the unencrypted database here
If someone was to enter something like "hunter2'; DROP TABLE secrets;--
" into the password prompt, the resulting SQL statement would look like this after substitution:
PRAGMA key='hunter2'; DROP TABLE secrets;--';
Typically, the solution to this problem is to use the DB-API's parameter substitution. From the sqlite3 documentation:
An SQL statement may use one of two kinds of placeholders: question marks (qmark style) or named placeholders (named style). For the qmark style, parameters must be a sequence whose length must match the number of placeholders, or a
ProgrammingError
is raised. For the named style, parameters must be an instance of adict
(or a subclass), which must contain keys for all named parameters; any extra items are ignored. Here’s an example of both styles:con = sqlite3.connect(":memory:") cur = con.execute("CREATE TABLE lang(name, first_appeared)") # This is the named style used with executemany(): data = ( {"name": "C", "year": 1972}, {"name": "Fortran", "year": 1957}, {"name": "Python", "year": 1991}, {"name": "Go", "year": 2009}, ) cur.executemany("INSERT INTO lang VALUES(:name, :year)", data) # This is the qmark style used in a SELECT query: params = (1972,) cur.execute("SELECT * FROM lang WHERE first_appeared = ?", params) print(cur.fetchall())
This works as expected in the sample code from the docs, but when using placeholders in a PRAGMA statement, we get an OperationalError
telling us there's a syntax error. This is the case for both types of parameter substitution.
# these will both fail
cur.execute('PRAGMA key=?;', (password,))
cur.execute('PRAGMA key=:pass;', {'pass': password})
I'm not sure where to go from here. If we actually enter our malicious string at the password prompt, it won't work, producing the following error:
Traceback (most recent call last): File "<stdin>", line 1, in <module> sqlcipher3.ProgrammingError: You can only execute one statement at a time.
So is the "naive" code from earlier safe? I'm not confident saying the answer is "yes" just because the one malicious string I could come up with didn't work, but there doesn't seem to be a better way of doing this. The answers to the only other person on here asking this question that I could find suggested equivalent solutions (python + sqlite insert variable into PRAGMA statement). I'd also rather not use an ORM, especially if it's just for this one case. Any suggestions would be appreciated, thanks.
According to the accepted answer to “Python sqlite3 string variable in execute”, there are limitations on where DB-API substitutions can be used:
Parameter markers can be used only for expressions, i.e., values. You cannot use them for identifiers like table and column names.
Seeing this, I figured that arguments to PRAGMA
must fall into the same category as “table and column names”. In fact, my specific use case was PRAGMA table_info
, where the argument is a table name.
On digging into it further, I found that Python’s sqlite3
module relies on SQLite’s own sqlite3_bind_*
functions to do parameter substitutions. For example, here is the code for substituting string values. And I found further confirmation that substitution won’t work for PRAGMA
arguments.
“But wait,” I thought. “Sam’s argument is a key, not a table name.” Without digging even deeper, I can only conjecture that it doesn’t matter, and SQLite (or SQLCipher) just doesn’t allow binding values to PRAGMA
statements.
Maybe you can supply the key via SQLCipher’s C API instead of through SQL? It doesn’t fix my use case, but it might help with yours!
For me, and for anyone else trying to programmatically provide a table name to PRAGMA table_info
, I guess the official solution is to double- and triple-check that the variable cannot possibly contain user input, validate and escape it anyway just in case, cross fingers, toes, knees and nose, and do a string substitution! What could possibly go wrong…