database sqlite optimization database-design

What approach would be best for SQL dynamically-ordered lists (i.e. music playlists)?

I am designing a database for a music player. I am already using SQLite to store music tracks in a table called tracks, containing a unique id.

I need to add very standard playlist functionality: playlists have a name, and can contain any number of tracks, possibly several instances of the same track, in a specific order. It should be possible to easily add/remove/reorder tracks in a playlist, as well as add/delete/duplicate playlists. I am not sure if SQL, or another solution, would be the best in terms of query and write efficiency.

My initial thought was to use a playlists table with a simple id and name, and use an associative playlist_tracks table with playlist_id, track_id, and track_index.

However, it seems this approach has several downsides:

It is not efficient to do many "normal" operations.
- If I add a track in the middle of playlist, all tracks after it need their index incremented. If there's 5000 tracks, and I added in the middle, that's 2500 rows to query and UPDATE with a WHERE track_id > 2500. I don't know how long this would lock the database for, especially in a transaction - ideally it should be perceptively instant to the user. Maybe it would be and I am concerned for no reason? I am not experienced enough with SQLite speeds to know.
- Same concern for re-ordering or removing existing tracks, or deleting entire playlists.
- Even the normally cheapest array operation of "add to end of playlist" isn't trivial, as it's first needed to query the database to find the largest track_index for a given playlist_id.
No guarantee of order being actually sequential. There's no inherent limitation in the database that ensures if a row exists WHERE playlist_id = 1 AND track_index = 100, that a row also exists WHERE playlist_id = 1 AND track_index = 99. That requirement is left up to the implementation to rigorously update all indices without fail, and leaves room for human error. In contrast, for an array of length N, the fact that all indices up to N exist is an inherent result of the data structure, since elements have no explicit index apart from their offset in the chunk of memory the array occupies.

This leads me to doubt the SQL approach and consider if it would be both simpler and more efficient to just a file per playlist containing a sequence of track_id. This is slightly less queryable for tasks like "which playlists contain track X", but that is a way rarer need than "insert track X at index Y in playlist Z".

There might be a better option.

What approach would be best for such SQL dynamically-ordered lists?

Unlike Best representation of an ordered list in a database? I am asking about a many-to-many relationship (i.e. any number of playlists can have any number of tracks). Furthermore, I am not asking specifically about SQL.

Which of SQL or a serialized array is the better approach?

What are the upsides/downsides?

Solution

I believe that you would be surprised how efficient SQLite can be and also what it can do.

The following is intended to be a pointer to "Perhaps reconsider SQLite's abilities" rather than being "the solution".

Perhaps consider a simple but not too efficient means of setting a unique track_index applied after the insert (perhaps as a TRIGGER), when the insert sets the track_index to a "special" value such as -1.

Further consider a database with some 500 playlists, 100000 tracks and that the update is applied after the insertion of something like 10000 rows.

Using a Windows laptop and an SQLite tool (Navicat) then:-

/* Mass update*/
UPDATE playlist_tracks AS pt01
    SET track_index = coalesce((SELECT max(track_index) FROM playlist_tracks WHERE playlist_id = pt01.playlist_id),0) + 1
    WHERE pt01.track_index < 1
> Affected rows: 9978
> Time: 0.087s

copied and pasted from an actual run
the above is not intended to be The Solution, the intention is just to show that SQLite can be quite efficient

So 9978 updates in less than 1/10th of a second. All of this in a single transaction. However, results will differ depending upon the device.

I am asking about a many-to-many relationship

That doesn't really complicate matters as the track_index appears to be a column in the associative table and thus the related tables and the relationship is unchanged. i.e. it is just the single table being manipulated.

As mentioned above a TRIGGER or TRIGGERS could automate matters.

Perhaps consider something like:-

/* Create a trigger for making a gap if necessary for the track_index for the respective playlist */
CREATE TRIGGER IF NOT EXISTS the_after_playlist_tracks_insert_trigger 
BEFORE INSERT ON playlist_tracks
BEGIN
    UPDATE playlist_tracks 
    SET track_index = track_index + 1 WHERE track_index >= new.track_index AND playlist_id = new.playlist_id
    ;
END;

This will before actually inserting a playlist_tracks row increment the track_index for all existing rows that are the same as the provided track_index or greater thus making a gap for the row that is being inserted.
oops name of the trigger not changed to before rather than after (initially tested/setup as AFTER and forgot to change the name accordingly)

e.g. (with the 100000 tracks and 500 playlists etc as per the UPDATE) :-

INSERT INTO playlist_tracks VALUES (10,30,5)
> Affected rows: 1
> Time: 0.023s

See the note re using BEFORE triggers in the link above

A suggestion is to use an SQLite tool and to test. The above as previously mentioned was undertaken using Navicat for SQLite. There are plenty of other tools.

The entire code used follows and could be considered as an example of how you can test methodologies with such a tool. You may find this useful:-

DROP TABLE IF EXISTS playlist_tracks;
DROP TABLE IF EXISTS tracks;
DROP TABLE IF EXISTS playlists;
DROP INDEX IF EXISTS playlist_tracks_idxon_track_id;
CREATE TABLE IF NOT EXISTS tracks (id INTEGER PRIMARY KEY, trackname TEXT, etc WHATEVER);
CREATE TABLE IF NOT EXISTS playlists (id INTEGER PRIMARY KEY, playlistname TEXT, etc);
CREATE TABLE IF NOT EXISTS playlist_tracks (
    playlist_id INTEGER, 
    track_id INTEGER, 
    track_index INTEGER, 
    PRIMARY KEY (playlist_id,track_id)
);
CREATE INDEX IF NOT EXISTS playlist_tracks_idxon_track_id ON playlist_tracks (track_id);
/* Add some basic data */
INSERT INTO tracks (trackname) VALUES
    ('trackA'),('trackB'),('trackC'),('trackD'),('Money'),('Brain Damage'),('Any Colour You Like'),('Money for Nothing'),('Sultans of Swing'),('Dreamer'),('Crime Of The Century'),('Rudy'),('Benny The Bouncer'),('The Gnome'),('Help');
INSERT INTO playlists(playlistname) VALUES('PL001'),('PL010'),('PL100');
/* add some tracks to the playlists */
INSERT INTO playlist_tracks VALUES
    (1,1,1),(1,5,2),(1,9,5),(1,3,3),
    (2,1,99),(2,7,2),(2,6,1),
    (3,14,1),(3,8,2),(3,4,3),(3,11,4)
;
/* Add some playlist_tracks where track_index is -1 (to be updated) */
INSERT OR IGNORE INTO playlist_tracks
    VALUES (
    (SELECT id FROM playlists ORDER BY random() LIMIT 1),
    (SELECT id FROM tracks ORDER BY random() LIMIT 1),
    -1
    )
;
INSERT OR IGNORE INTO playlist_tracks
    VALUES (
    (SELECT id FROM playlists ORDER BY random() LIMIT 1),
    (SELECT id FROM tracks ORDER BY random() LIMIT 1),
    -1
    )
;
INSERT OR IGNORE INTO playlist_tracks
    VALUES (
    (SELECT id FROM playlists ORDER BY random() LIMIT 1),
    (SELECT id FROM tracks ORDER BY random() LIMIT 1),
    -1
    )
;
/* Output 1 current results */
SELECT track_index,playlists.*,tracks.* FROM playlists 
    JOIN playlist_tracks ON playlists.id = playlist_tracks.playlist_id 
    JOIN tracks ON playlist_tracks.track_id = tracks.id
    ORDER BY playlists.id,playlist_tracks.track_index
;
/* Mass update to change -1 values */
UPDATE playlist_tracks AS pt01
    SET track_index = coalesce((SELECT max(track_index) FROM playlist_tracks WHERE playlist_id = pt01.playlist_id),0) + 1
    WHERE pt01.track_index < 1
;
/* Output 2 results after update */
SELECT track_index,playlists.*,tracks.* FROM playlists 
    JOIN playlist_tracks ON playlists.id = playlist_tracks.playlist_id 
    JOIN tracks ON playlist_tracks.track_id = tracks.id
    ORDER BY playlists.id,playlist_tracks.track_index
;

/* Generate some 100000 tracks via a recursive CTE */
WITH cte_counter1(x) AS
    (SELECT 1 UNION ALL SELECT x+1 FROM cte_counter1 LIMIT 100000)
INSERT OR IGNORE INTO tracks (trackname) SELECT 'Generated track '||(coalesce(x,1)*5) FROM cte_counter1;
/* Generate some 500 playlists via a recursive CTE */
WITH cte_counter2(x) AS
    (SELECT 1 UNION ALL SELECT x+1 FROM cte_counter2 LIMIT 500)
INSERT OR IGNORE INTO playlists (playlistname) SELECT 'Generated PlaylistName '||(coalesce(x,1)*3) FROM cte_counter2;

/* Insert 10000 playlist_tracks randomly BUT always with  -1 track_index*/
WITH
    cte_max_trkid(trkid) AS (SELECT max(id) FROM tracks),
    cte_max_plid(plid) AS (SELECT max(id) FROM playlists),
    cte_for_insert(p,t,ti) AS
    (
        SELECT 
            abs(random()% (SELECT plid FROM cte_max_plid)),
            abs(random()% (SELECT trkid FROM cte_max_trkid)),
            -1
        UNION ALL
        SELECT
            abs(random()% (SELECT plid FROM cte_max_plid)),
            abs(random()% (SELECT trkid FROM cte_max_trkid)),
            ti
        FROM cte_for_insert LIMIT 10000
    )
INSERT OR IGNORE INTO playlist_tracks SELECT * FROM cte_for_insert WHERE p IN (SELECT id FROM playlists) AND t IN (SELECT id FROM tracks);

    
/* Mass update to change the track_index*/
UPDATE playlist_tracks AS pt01
    SET track_index = coalesce((SELECT max(track_index) FROM playlist_tracks WHERE playlist_id = pt01.playlist_id),0) + 1
    WHERE pt01.track_index < 1
;

/* Output 3 results after the track_index adjustments */
SELECT track_index,playlists.*,tracks.* FROM playlists 
    JOIN playlist_tracks ON playlists.id = playlist_tracks.playlist_id 
    JOIN tracks ON playlist_tracks.track_id = tracks.id
    ORDER BY playlists.id,playlist_tracks.track_index
;

/* Create a trigger for making a gap if necessary for the track_index for the respective playlist */
CREATE TRIGGER IF NOT EXISTS the_after_playlist_tracks_insert_trigger 
BEFORE INSERT ON playlist_tracks
BEGIN
    UPDATE playlist_tracks 
    SET track_index = track_index + 1 WHERE track_index >= new.track_index AND playlist_id = new.playlist_id
    ;
END;

/* Output 4 the playlist before the insert */
SELECT track_index,playlists.*,tracks.* FROM playlists 
    JOIN playlist_tracks ON playlists.id = playlist_tracks.playlist_id 
    JOIN tracks ON playlist_tracks.track_id = tracks.id
    WHERE playlists.id = 10
    ORDER BY playlists.id,playlist_tracks.track_index
;

/* The insert requireing a gap so it is the 5th in the playlist moving the current 5th and all other higher rows up by 1 */
INSERT INTO playlist_tracks VALUES (10,30,5);

/* Output 5 thr playlit after the insert */
SELECT track_index,playlists.*,tracks.* FROM playlists 
    JOIN playlist_tracks ON playlists.id = playlist_tracks.playlist_id 
    JOIN tracks ON playlist_tracks.track_id = tracks.id
    WHERE playlists.id = 10
    ORDER BY playlists.id,playlist_tracks.track_index
;

/* CLEANUP TEST ENVIRONMENT*/
DROP TABLE IF EXISTS playlist_tracks;
DROP TABLE IF EXISTS tracks;
DROP TABLE IF EXISTS playlists;
DROP INDEX IF EXISTS playlist_tracks_idxon_track_id;

In regard to the TRIGGER the before the insert output shows, for example :-

After then:-