python, sqlite

Python 3 - Is there a way to iterate row by row over a very large SQLite table without loading the entire table into memory?


I have a very large table with 250,000+ rows, many containing a large text block in one of the columns. Right now it's 2.7 GB and expected to grow at least tenfold. I need to perform Python-specific operations on every row of the table, but only need to access one row at a time.

Right now my code looks something like this:

c.execute('SELECT * FROM big_table') 
table = c.fetchall()
for row in table:
    do_stuff_with_row(row)

This worked fine when the table was smaller, but the table is now larger than my available RAM and Python hangs when I try to run it. Is there a better (more RAM-efficient) way to iterate row by row over the entire table?


Solution

  • cursor.fetchall() fetches all results into a list first.

    Instead, you can iterate over the cursor itself:

    c.execute('SELECT * FROM big_table') 
    for row in c:
        do_stuff_with_row(row)
    

    This produces rows as they are needed, rather than loading them all into memory first.
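
  • Alternatively, if you want explicit control over how many rows are held in memory at a time, the standard sqlite3 cursor also supports fetchmany(). Here is a minimal sketch; the database filename, the batch size of 1000, and do_stuff_with_row are placeholders, not part of the original question:

    import sqlite3

    conn = sqlite3.connect('my_database.db')  # placeholder path
    c = conn.cursor()
    c.execute('SELECT * FROM big_table')

    # Pull rows in fixed-size batches so only one batch is in memory at a time.
    while True:
        rows = c.fetchmany(1000)
        if not rows:  # fetchmany() returns an empty list once the result set is exhausted
            break
        for row in rows:
            do_stuff_with_row(row)

    conn.close()

    Either way, rows are streamed from the database instead of being materialized as one giant list.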