pythondatabasesqliteoperationalerror

SQLite3 and Python 3 - "no such column" error with strings but not integers


For this project, I have a database file with a table RESULTS which looks like this:

conn.execute('''CREATE TABLE RESULTS(
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL,
age TEXT NOT NULL,
gender TEXT NOT NULL);''')

And this is the file I'm using to create a prototype front end so that users can access the data in the database:

#setting up the user input variables
srcIn = input("what field do you want to search: id, name, age, or gender? ")
srcQuery = input("what result do you want to find from the "+ srcIn + " search? ")
if(srcIn == "age") or (srcIn == "id"):
    #by default input returns a string but it needs to be an int for age and id parameters
    srcQuery = int(srcQuery)
    print("converted to int")
srcOut = input("what field do you want to return: id, name, age, or gender? ")

print("making cursor")
cursor = conn.execute("SELECT %s FROM RESULTS WHERE %s = %s" % (srcOut, srcIn, srcQuery))
for row in cursor:
    print(srcOut + ": " + row[0])

conn.close()
print("closed database " + db)

If I were to run this and pass it age or id, the number 45, and name it would return the names, genders, and other stats from my database perfectly. However, when I pass it gender, and female or male, and returning the names or ages or ids, an error would pop up saying this:

cursor = conn.execute("SELECT %s FROM RESULTS WHERE %s = %s" % (srcOut, srcIn, srcQuery))
sqlite3.OperationalError: no such column: male

For this specific example:

srcOut = name
srcIn = gender
srcQuery = male

I've tried replacing the %s with ? arguments (I know it won't work on the parameters that I'm passing to the SELECT statement), so I'm stumped.

I think that it has something to do with the fact that in certain instances I'm passing it an integer in the srcQuery field and in others I'm passing it a string.

Any help or suggestions would be appreciated.


Solution

  • No, you should not be doing it the way @Suever has suggested.

    Imagine what query would be executed if I provide the following values:

    srcOut = "name"
    srcIn = "gender"
    srcQuery = "' or 1 = 1 -- "
    

    Your code would actually execute:

    SELECT 
        name 
    FROM 
        RESULTS 
    WHERE  
        gender = '' or 1 = 1 -- '
    

    This will match every single row in the table.

    Or, in other words, because of the lack of any validation and escaping and by preparing a specific input I'm getting all the results in a table which is, of course, what you don't want to happen in real world. This is called an SQL injection attack.

    Instead of string formatting, you should parameterize your query. Query parameterization though is not going to work for table and column names - these you need to validate before inserting into the query - since the amount of possible values for scrIn and srcOut is very limited - I'd simply throw an error if you see an unexpected value, e.g.:

    columns = {'id', 'name', 'age', 'gender'} 
    srcIn = input("what field do you want to search: id, name, age, or gender? ")
    if srcIn not in columns:
        raise ValueError("Invalid srcIn value")
    

    As for the srcQuery - this one you need to parameterize:

    query = "SELECT {column_out} FROM RESULTS WHERE {column_in} = ?".format(column_out=srcOut, column_in=scrIn)
    cursor = conn.execute(query, (srcQuery, ))
    

    Aside from being much safer, this would also eliminate the problem of thinking about python-to-database type conversions and quotes - the database driver would handle that automatically.