
Encoding Spotify URI to Spotify Codes


Spotify Codes are little barcodes that allow you to share songs, artists, users, playlists, etc.

They encode information in the different heights of the "bars". Each of the 23 bars can take one of 8 discrete heights, which means 8^23 different possible barcodes.

Spotify generates barcodes based on their URI schema. This URI spotify:playlist:37i9dQZF1DXcBWIGoYBM5M gets mapped to this barcode:

[Image: Spotify code barcode]

The URI has a lot more information (62^22) in it than the code. How would you map the URI to the barcode? It seems like you can't simply encode the URI directly. For more background, see my "answer" to this question: https://stackoverflow.com/a/62120952/10703868
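To make the size gap concrete, here is a rough comparison in bits. It is only a sketch: it assumes the 22-character ID draws from a 62-symbol alphabet (as the 62^22 figure implies) and that all 23 bars are free to vary. The constant names are mine, not Spotify's.

```python
import math

BAR_COUNT = 23          # bars in a Spotify Code
HEIGHTS_PER_BAR = 8     # discrete heights per bar
ID_LENGTH = 22          # characters in the ID part of the URI
ID_ALPHABET_SIZE = 62   # assumption: a-z, A-Z, 0-9

# Information capacity in bits = number of symbols * log2(alphabet size).
barcode_bits = BAR_COUNT * math.log2(HEIGHTS_PER_BAR)
uri_bits = ID_LENGTH * math.log2(ID_ALPHABET_SIZE)

print(f"barcode: {barcode_bits:.1f} bits")
print(f"URI ID:  {uri_bits:.1f} bits")
```

That is roughly 69 bits against roughly 131 bits, so the barcode cannot hold the ID directly.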


Solution

  • The patent explains the general process; this is what I have found.

    This is a more recent patent

    When using the Spotify code generator the website makes a request to https://scannables.scdn.co/uri/plain/[format]/[background-color-in-hex]/[code-color-in-text]/[size]/[spotify-URI].
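As a sketch of how that URL template can be filled in: the function name `scannable_url` and its default values (png, black background, white bars, 640 pixels) are illustrative assumptions, not documented values, and percent-encoding the URI is also an assumption.

```python
from urllib.parse import quote

BASE = "https://scannables.scdn.co/uri/plain"

def scannable_url(uri, fmt="png", background="000000", bar_color="white", size=640):
    """Fill in the URL template observed on the code generator website.

    All defaults are illustrative assumptions, not documented values.
    """
    # Percent-encode the URI so the colons become %3A.
    encoded_uri = quote(uri, safe="")
    return f"{BASE}/{fmt}/{background}/{bar_color}/{size}/{encoded_uri}"

print(scannable_url("spotify:playlist:37i9dQZF1DXcBWIGoYBM5M"))
```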

    Intercepting the traffic with Burp Suite shows that when you scan a code, the Spotify app sends a request to Spotify's API: https://spclient.wg.spotify.com/scannable-id/id/[CODE]?format=json where [CODE] is the media reference that you were looking for. This request can be made from Python, but only with a [TOKEN] generated by the app, as this is the only way to get the correct scope. The app token expires in about half an hour.

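    import requests

    # Headers copied from the iOS app's request as captured in Burp Suite.
    # Replace [TOKEN] with a bearer token taken from the app; it expires
    # after roughly half an hour.
    headers = {
        "X-Client-Id": "58bd3c95768941ea9eb4350aaa033eb3",
        "Accept-Encoding": "gzip, deflate",
        "Connection": "close",
        "App-Platform": "iOS",
        "Accept": "*/*",
        "User-Agent": "Spotify/8.5.68 iOS/13.4 (iPhone9,3)",
        "Accept-Language": "en",
        "Authorization": "Bearer [TOKEN]",
        "Spotify-App-Version": "8.5.68",
    }

    # 26560102031 is the media reference read from the scanned code.
    url = "https://spclient.wg.spotify.com/scannable-id/id/26560102031?format=json"
    response = requests.get(url, headers=headers)

    print(response)         # response object, shows the HTTP status
    print(response.json())  # decoded JSON body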
    

    Which returns:

    <Response [200]>
    {'target': 'spotify:playlist:37i9dQZF1DXcBWIGoYBM5M'}
    

    So 26560102031 is the media reference for your playlist.

    The patent states that the code is first detected and then possibly converted into 63 bits using a Gray table: each of the 21 height digits (0-7) becomes 3 bits. For example, 361354354471425226605 is encoded into 010 101 001 010 111 110 010 111 110 110 100 001 110 011 111 011 011 101 101 000 111.
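The Gray table step can be sketched as follows. This assumes the table is the standard 3-bit reflected Gray code, which is my reading rather than something the patent text here confirms; it does reproduce the example in this answer. The names `GRAY_TABLE` and `heights_to_bits` are mine.

```python
# Standard 3-bit reflected Gray code: n -> n XOR (n >> 1).
# Assumption: this is the "Gray table" the patent refers to.
GRAY_TABLE = [format(n ^ (n >> 1), "03b") for n in range(8)]

def heights_to_bits(heights: str) -> str:
    """Map each bar height digit (0-7) to its 3-bit Gray code word."""
    return " ".join(GRAY_TABLE[int(digit)] for digit in heights)

print(heights_to_bits("361354354471425226605"))
# 010 101 001 010 111 110 010 111 110 110 100 001 110 011 111 011 011 101 101 000 111
```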

    However, the code sent to the API is 6875667268. I'm unsure how the media reference is generated, but this is the number used in the lookup table.

    The reference contains the digits 0-9, whereas the Gray table only covers 0-7, implying that an algorithm using normal binary has been used. The patent talks about using a convolutional code and then the Viterbi algorithm for error correction, so this may be the output from that, which I believe is impossible to recreate without the states. However, I'd be interested if you can interpret the patent any better.

    This media reference has 10 digits; however, others have 11 or 12.

    Here are two more examples, each giving the raw distances, the Gray table binary and then the media reference:

    1.

    022673352171662032460
        
    000 011 011 101 100 010 010 111 011 001 100 001 101 101 011 000 010 011 110 101 000
        
    67775490487
    
    2.

    574146602473467556050
    
    111 100 110 001 110 101 101 000 011 110 100 010 110 101 100 111 111 101 000 111 000
    
    57639171874
    

    edit:

    Some extra info: there are some posts online describing how you can encode any text, such as spotify:playlist:HelloWorld, into a code; however, this no longer works.

    I also discovered through the proxy that you can use the domain to fetch the album art of a track above the code. This suggests a closer integration between Spotify's API and this scannables URL than previously thought, as it not only stores the URIs and their codes but can also validate URIs and return updated album art.

    https://scannables.scdn.co/uri/800/spotify%3Atrack%3A0J8oh5MAMyUPRIgflnjwmB