I am attempting to unit test a function of mine that checks if something is in a database based on a bunch of "identifiers". This is my very first time doing anything with sqlite3 or SQL at all so it's possible I'm making some very stupid mistake. This function has worked when I ran it on actual data, but for some reason is failing in this testing version. I am using Python 3.9, and sqlite3.
After adding several statements to try and debug, I've ended up with this in my code: (there's actually more columns but I shortened it, it just adds more AND to the WHERE statement, also these are obviously not the variable names in my actual code)
import sqlite3
class TestObject:
def __init__(self, db_loc):
self.con = sqlite3.connect(db_loc)
self.cur = self.con.cursor()
def get_thingy(self, row):
# this is applied to a pandas dataframe with .apply(lambda row: self.get_thingy(row), axis=1) so row is a pd.Series
params = (row.col1, row.col2, str(row.col3), row.col4, row.col5)
print(params)
res = self.cur.execute("""
SELECT thingy FROM thingies
WHERE (col1 = ? or col2 = ?) AND col3 = ? AND col4 = ? AND col5 = ?""", params).fetchone()
print(res)
print(self.cur.execute("""SELECT thingy FROM thingies
WHERE (col1 = 1 or col2 = 2) AND col3 = '3' AND col4 = 4 AND col5 = 5""").fetchone())
# and then more code that does stuff with the retrieved thingy
def get_thingies(self):
self.thingy_table['thingy'] = self.thingy_table.apply(lambda row: self.get_thingy(row), axis=1)
while my test looks like
import pytest
import os
import pandas as pd
from [path to the other file] import TestObject
class TestClass:
@pytest.fixture
def testobj(self):
db_path = 'test.db'
test = TestObject(db_path)
yield test
if test.con: test.con.close()
os.remove(db_path)
def test_get_thingy(self, testobj):
testobj.cur.execute("""CREATE TABLE thingies(col1, col2, col3, col4, col5, thingy)""")
testobj.cur.execute("""INSERT INTO thingies VALUES (1, 2, '3', 4, 5, "yay thingy")""")
testobj.con.commit()
testobj.get_thingy(pd.Series({'col1': 1, 'col2': 2, 'col3': 3, 'col4': 4, 'col5': 5}))
Running this test prints
(1, 2, '3', 4, 5)
None
('yay thingy',)
I have tried copy/pasting the printed params
list in to replace the params
keyword in the .execute() command, and this caused both queries to return yay thingy
. So something is strange about the way that I am accessing these values.
Why does it not work with params
as an input, but if I print params
and copy what is printed, that works? Shouldn't it be the exact same? Is something strange about taking these values from a pd.Series? The get_thingies function worked on actual data in a DataFrame, is a row of a DataFrame not actually a Series like I thought it was? I'm losing my mind.
Found my solution, posting as an answer in case anyone else finds this through Google or something.
Anon Coward's comment about needing .item() did make the test work, however it broke the actual code when I ran it on data, so obviously I had to find another solution.
It turns out, all I had to do was make the numbers in the test into floats instead of integers - so the INSERT and final lines of my test turned into
testobj.cur.execute("""INSERT INTO thingies VALUES (1.0, 2.0, "3.0", 4.0, 5.0, "yay thingy")""")
and
testobj.get_thingy(pd.Series({'col1': 1.0, 'col2': 2.0, 'col3': 3.0, 'col4': 4.0, 'col5': 5.0}))
This makes the test work without having to change the code I'm testing.