pythonsqlitesqlite3-python

CREATE UNIQUE INDEX IF NOT EXISTS still creates new indexes in the Second Iteration


I'm using a database to store some data that I get from an API, and then need to update the database every 5 minutes(still didn't work on that). The issue is that everytime I execute the code again the table gets updated correctly without any duplicates, but the indexes go from 1,2,3,4,.. to 3000,3001,3002,... (the table has ~3000 rows thats why).

This is the code I'm using:

import sqlite3

#opening a (new) database
connection = sqlite3.connect("databases/database.db")
cursor = connection.cursor()

#creating table
cursor.execute('''CREATE TABLE IF NOT EXISTS items (id INTEGER PRIMARY KEY, name TEXT, 
avg24hPrice TEXT, updated TEXT, types TEXT, lastLowPrice TEXT, changeLast48hPercent TEXT, 
iconLink TEXT, wikiLink TEXT,width TEXT, height TEXT, receivedFromTasks TEXT, usedInTasks TEXT,
sellFor TEXT)''')

# Create the unique index if it doesn't exist
cursor.execute('''CREATE UNIQUE INDEX IF NOT EXISTS idx_items_name ON items (name)''')

################################################################
#ALSO TRIED THIS WITHOUT SUCCESS
cursor.execute("PRAGMA INDEX_INFO(idx_items_name)")
index_info = cursor.fetchall()
print("Index_Info: \n", index_info)
if not index_info:
    print("Creating Indexes")
    cursor.execute("DROP INDEX IF EXISTS idx_items_name")
    cursor.execute('''CREATE UNIQUE INDEX IF NOT EXISTS idx_items_name ON items (name)''')
################################################################

#iterating trough a dictionary of items i need to add to the database
for itm in result:
    cursor.execute('''REPLACE INTO items (name, avg24hPrice, updated, types, lastLowPrice,
                      changeLast48hPercent, iconLink, wikiLink, width, height, 
                      receivedFromTasks, usedInTasks, sellFor) 
                      VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
                   ''',(str(itm["name"]), str(itm["avg24hPrice"]), str(itm["updated"]), 
                       str(itm["types"]), str(itm["lastLowPrice"]),str(itm["changeLast48hPercent"]), 
                       str(itm["iconLink"]), str(itm["wikiLink"]), str(itm["width"]), 
                       str(itm["height"]), str(itm["receivedFromTasks"]), str(itm["usedInTasks"]), 
                       str(itm["sellFor"])))

connection.commit()
#closing database & cursor
cursor.close()
connection.close()

Solution

  • When are saying the index goes from 1,2,3.... 3000 ... when there are less than 3000 rows then that is exactly how it works.

    The index, actually a unique identifier of the row (the rowid; the id column being an alias thereof) is 1 greater than the highest *id in the table.

    When you replace a row and it is not the row with the highest id then the row being replaced is deleted and a new row is inserted thus the id will be 1 greater than the highest id and hence for each replace, unless it is the row with the highest id, the highest id will be greater than the number of rows in the table.

    As per

    When a UNIQUE or PRIMARY KEY constraint violation occurs, the REPLACE algorithm deletes pre-existing rows that are causing the constraint violation prior to inserting or updating the current row and the command continues executing normally. If a NOT NULL constraint violation occurs, the REPLACE conflict resolution replaces the NULL value with the default value for that column, or if the column has no default value, then the ABORT algorithm is used. If a CHECK constraint or foreign key constraint violation occurs, the REPLACE conflict resolution algorithm works like ABORT.

    When the REPLACE conflict resolution strategy deletes rows in order to satisfy a constraint, delete triggers fire if and only if recursive triggers are enabled.

    The id has nothing at all to do with the index that you have created, they are separate indexes; the rowid index exists for all tables unless the table is defined using WITHOUT ROWID or is a virtual table. The index on the name column is a user defined index. However, it is the UNIQUE conflict on the name column that is the trigger for the REPLACE as you are not providing a value for the id column.

    Note that if you coded AUTOINCREMENT (and it is NOT recommended that you do) then the behaviour is slightly different. That is the highest id (alias of the rowid) is stored in a system table sqlite_sequence and the id would then be 1 greater than the higher of the recorded/saved id. So if the row with the highest id were replaced then the id would be 1 greater (where it would not be if not using AUTOINCREMENT).

    You may wish to refer to https://www.sqlite.org/autoinc.html, which also covers the rowid column.