Tags: python, spotify, spotipy

How to handle Spotify API fetching different track IDs over time?


I am currently working on a project that stores files as Spotify playlists. I coincidentally already had a playlist with almost exactly the right number of tracks, so I used Spotify to fetch all 8192 (2^13) track IDs and made a lookup dictionary between each track ID and a corresponding 13-bit binary string. To write a file, my program reads its binary and adds the corresponding track IDs to a playlist. To read the file back, I fetch all the IDs from the playlist and use the lookup dictionary to convert them back to binary.
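To make the scheme concrete, here is a minimal sketch of the lookup dictionary and the write/read conversion, without any Spotify calls. The function names, the zero-padding of the last chunk, and passing the original byte length to `decode` are assumptions for illustration, not details of the actual program.

```python
BITS_PER_TRACK = 13  # 2**13 == 8192 tracks, one per 13-bit pattern


def build_lookup(track_ids):
    """Return (bits -> track ID, track ID -> bits) for exactly 8192 unique IDs."""
    if len(set(track_ids)) != 2 ** BITS_PER_TRACK:
        raise ValueError("need exactly 8192 unique track IDs")
    bits_to_id = {
        format(i, f"0{BITS_PER_TRACK}b"): track_id
        for i, track_id in enumerate(track_ids)
    }
    id_to_bits = {track_id: bits for bits, track_id in bits_to_id.items()}
    return bits_to_id, id_to_bits


def encode(data, bits_to_id):
    """Turn bytes into the list of track IDs to add to a playlist."""
    bitstring = "".join(format(byte, "08b") for byte in data)
    # Assumption: pad the final chunk with zeros up to a multiple of 13 bits.
    bitstring += "0" * (-len(bitstring) % BITS_PER_TRACK)
    return [
        bits_to_id[bitstring[i:i + BITS_PER_TRACK]]
        for i in range(0, len(bitstring), BITS_PER_TRACK)
    ]


def decode(track_ids, id_to_bits, n_bytes):
    """Turn fetched track IDs back into bytes.

    Raises KeyError if Spotify returned an ID that is not in the lookup,
    which is the failure described in this question.
    """
    bitstring = "".join(id_to_bits[track_id] for track_id in track_ids)
    bitstring = bitstring[: n_bytes * 8]  # drop the padding bits
    return bytes(int(bitstring[i:i + 8], 2) for i in range(0, len(bitstring), 8))
```

With this layout, a single unknown ID anywhere in the playlist makes the whole read fail, which is why the changed IDs matter so much.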

The problem I am having: even though I am using the original IDs to add the tracks to the playlists, for a few tracks, Spotify is now fetching a different ID. This is causing errors because the new ID that is being fetched is not in the lookup dict.

For example, one of the track IDs in my dictionary is '1lIkht1mUGFKnDQss3Qk6K', but when I try to read a file from a playlist that had that ID added to it, spotipy fetches '7jokkQ2OZEjngxvDbRzWPs'. Both open.spotify.com/track/1lIkht1mUGFKnDQss3Qk6K and https://open.spotify.com/track/7jokkQ2OZEjngxvDbRzWPs lead to the same song, but if you click through to the album from each page, one says single and one says EP, even though the one labeled single still has all the songs from the EP. They are obviously separate track entries, but somehow adding the old ID adds the new-ID version of the track, or fetching the ID of the old-ID version of the track actually returns the new ID.

Manually fixing these changed IDs is a lot of work, since you have to locate each ID, and it is essentially a game of whack-a-mole.

The only solution I can think of is catching the error during the read process and, when it comes up, running some sort of recalibration process: add all the IDs to a temporary playlist, fetch them all again, and update the dictionary as necessary. But that would take forever, as the API is limited to 100 tracks per request, so that is about 160 API requests. This would take exponentially more time and API requests once I eventually add more track IDs to increase the information density.
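For reference, the recalibration idea could be sketched roughly like this. The `roundtrip` callable is hypothetical: it stands in for "add up to 100 stored IDs to a temporary playlist, fetch them back, and return the IDs Spotify reports". The sketch also assumes the IDs come back in the same order they were added.

```python
def chunked(items, size=100):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def recalibrate(id_to_bits, roundtrip, batch_size=100):
    """Return (updated lookup, {old ID: new ID}) after one pass over all IDs.

    `roundtrip` is a hypothetical callable (not a spotipy function) that takes
    a list of stored IDs and returns the IDs reported back, in the same order.
    """
    updated = dict(id_to_bits)  # keep the old IDs so writing still works
    changed = {}
    for batch in chunked(list(id_to_bits), batch_size):
        returned = roundtrip(batch)
        if len(returned) != len(batch):
            raise ValueError("round trip returned a different number of tracks")
        for old_id, new_id in zip(batch, returned):
            if old_id != new_id:
                updated[new_id] = id_to_bits[old_id]
                changed[old_id] = new_id
    return updated, changed
```

Each call to `roundtrip` costs at least one add request and one fetch request, so the total number of API requests is about twice the number of batches.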

Is there a better way to handle the situation?


Solution

  • The easiest fix I found was to store an "identifier string" made up of the track name, artist name(s), and album name for each track, in addition to the binary chunk it represents and the track ID. It worked for a while, and then I found that one of the artists who appeared on multiple songs in my database had changed their name, rendering those identifier strings useless. After examining the API responses, I found another field that seems to have the least probability of changing: I don't think the ISRC code should change, so I replaced the artist name(s) with it.

    Only time will tell if this holds up or if these values are prone to changing as well.
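A minimal sketch of the identifier-string fallback described above, assuming each track is the dict the Web API returns for a track (with `id`, `name`, `album.name`, and `external_ids.isrc`). The function names, the `|` separator, and the two-dictionary layout are illustrative assumptions, not the exact code I use.

```python
def make_identifier(track):
    """Build 'name|ISRC|album name' from a track dict, or None without an ISRC."""
    isrc = track.get("external_ids", {}).get("isrc")
    if isrc is None:
        return None
    return "|".join((track["name"], isrc, track["album"]["name"]))


def resolve_bits(track, id_to_bits, identifier_to_bits):
    """Look up a fetched track by ID first, then by identifier string.

    When the identifier matches under a new ID, the new ID is stored in
    `id_to_bits` so the next read finds it directly.
    """
    bits = id_to_bits.get(track["id"])
    if bits is not None:
        return bits
    identifier = make_identifier(track)
    if identifier is None or identifier not in identifier_to_bits:
        raise KeyError(f"unknown track: {track['id']}")
    bits = identifier_to_bits[identifier]
    id_to_bits[track["id"]] = bits
    return bits
```

The identifier dictionary is built once from the original track data, for example `identifier_to_bits[make_identifier(track)] = bits` for every stored track. On a read, any track whose ID is missing from the lookup is matched by identifier instead of raising immediately.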