Tags: python, dataset, google-colaboratory, spotipy

How could I create a dataset with data about music such as songs, albums, artists, and their metadata such as when they were published/established?


I am creating an app that will need to access a very large dataset. The dataset will need to have data about music, films, locations, etc.

I have started by populating my database with music data, and I found spotipy, a Python client for the Spotify Web API. Does anyone know whether spotipy would be a good way to populate my dataset with music? I have experimented with it and found that the search method returns at most 50 results per call, which does not help my situation.

The examples in their documentation show how to retrieve all albums of a specific artist:

birdy_uri = 'spotify:artist:2WX2uTcsvV5OnS0inACecP'
spotify = spotipy.Spotify(client_credentials_manager=CLIENT_ID)

results = sp.artist_albums(birdy_uri, album_type='album')
albums = results['items']
while results['next']:
    results = spotify.next(results)
    albums.extend(results['items'])

for album in albums:
    print(album['name'])

I got this working and it prints this:

Young Heart
Beautiful Lies
Beautiful Lies
Beautiful Lies (Deluxe)
Beautiful Lies (Deluxe)
Fire Within
Fire Within
Fire Within (Deluxe)
Fire Within (Deluxe)
Fire Within (Deluxe)
Live in London
Birdy
Birdy
Birdy
Birdy
Birdy (Deluxe Version)

But when I try the same thing with a different artist, for example Led Zeppelin, like this:

ledZep_uri = "spotify:artist:36QJpDe2go2KgaRleHCDTp"
spotify = spotipy.Spotify(client_credentials_manager=CLIENT_ID)

results = sp.artist_albums(ledZep_uri , album_type='album')
albums = results['items']

while results['next']:
    results = spotify.next(results)
    albums.extend(results['items'])

for album in albums:
    print(album['name'])

I get a traceback error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-14-451d92017532> in <module>
      6 
      7 while results['next']:
----> 8     results = spotify.next(results)
      9     albums.extend(results['items'])
     10 

3 frames
/usr/local/lib/python3.9/dist-packages/spotipy/client.py in _auth_headers(self)
    234             return {}
    235         try:
--> 236             token = self.auth_manager.get_access_token(as_dict=False)
    237         except TypeError:
    238             token = self.auth_manager.get_access_token()

AttributeError: 'str' object has no attribute 'get_access_token'

I understand that it has a problem with my access token, but I don't understand why it didn't throw an error with the original Birdy example. I would have expected that if there were a problem with my access token, neither example would work.

Anyway, I am curious whether spotipy would work for populating my database. If anyone has experience with spotipy, or knows a different method of populating my database (maybe web scraping?), I would really appreciate some insight.

Thanks


Solution

  • Two things were wrong in your code.

    #1 Credential setting: pass a SpotifyClientCredentials object, not the client ID string

    From

    spotify = spotipy.Spotify(client_credentials_manager=CLIENT_ID)
    

    To

    auth_manager = SpotifyClientCredentials(client_id=CLIENT_ID, client_secret=CLIENT_SECRET)
    spotify = spotipy.Spotify(auth_manager=auth_manager)
    

    #2 spotify variable

    From

    results = sp.artist_albums(ledZep_uri , album_type='album')
    

    To

    results = spotify.artist_albums(ledZep_uri , album_type='album')
    

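    You created the client as spotify = spotipy.Spotify(...), so the API call should be spotify.artist_albums(), not sp.artist_albums().

    This also explains why the Birdy example appeared to work. The first request went through sp, which presumably was a correctly authenticated client left over from an earlier notebook cell. Birdy's 16 albums fit into a single page (the default page size is 20), so results['next'] was empty and spotify.next() was never called. Led Zeppelin has 30 albums, so the loop called spotify.next() on the client that had been given a plain string instead of a credentials manager, which raised the AttributeError.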
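
To see the paging behaviour concretely, here is a minimal offline sketch. It does not call Spotify at all: FakeClient and collect_all_items are illustrative names (not part of spotipy), and the fake pages mimic only the 'items' and 'next' keys of a paging result. The point is that next() is only reached when the first page has a 'next' link.

```python
def collect_all_items(client, first_page):
    """Follow the 'next' links of a paging result and return every item."""
    items = list(first_page['items'])
    page = first_page
    while page and page.get('next'):
        page = client.next(page)
        if page:
            items.extend(page['items'])
    return items


class FakeClient:
    """Hypothetical stand-in for spotipy.Spotify so the sketch runs offline."""

    def __init__(self, pages):
        self.pages = pages
        self.next_calls = 0

    def next(self, page):
        # Count how often the paging call is actually made
        self.next_calls += 1
        return self.pages[page['next']]


# 16 albums fit in one page: 'next' is None, so client.next() is never called.
one_page = {'items': list(range(16)), 'next': None}
short_client = FakeClient({})
short_albums = collect_all_items(short_client, one_page)

# 30 albums need a second page, so client.next() is called once.
second = {'items': list(range(20, 30)), 'next': None}
first = {'items': list(range(20)), 'next': 'page2'}
long_client = FakeClient({'page2': second})
long_albums = collect_all_items(long_client, first)

print(len(short_albums), short_client.next_calls)  # 16 0
print(len(long_albums), long_client.next_calls)    # 30 1
```

With a real, correctly authenticated client, the same helper could be called as collect_all_items(spotify, spotify.artist_albums(artist_uri, album_type='album')).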

    Here is a complete demo:

    import spotipy
    from spotipy.oauth2 import SpotifyClientCredentials
    
    CLIENT_ID='<your client ID>'
    CLIENT_SECRET='<your client Secret>'
    
    auth_manager = SpotifyClientCredentials(client_id=CLIENT_ID, client_secret=CLIENT_SECRET)
    spotify  = spotipy.Spotify(auth_manager=auth_manager)
    
    artist_uri = 'spotify:artist:2WX2uTcsvV5OnS0inACecP' # Birdy
    # artist_uri = 'spotify:artist:36QJpDe2go2KgaRleHCDTp' # ledZep
    
    results = spotify.artist_albums(artist_uri, album_type='album')
    albums = results['items']
    artist_name = results['items'][0]['artists'][0]['name']
    # Follow the 'next' links until the last page has been collected
    while results['next']:
        results = spotify.next(results)
        albums.extend(results['items'])
    
    print('-------------------------------------------------------------------------------------------------------------')
    print('{:40s} : {:12s} : {:12s} : {:10s}'.format(artist_name +' '+'Album Name' , 'Release Date', 'Total Tracks', 'Album URI'))
    print('-------------------------------------------------------------------------------------------------------------')
    for album in albums:
        print('{:40s} : {:12s} : {:12d} : {:10s}'.format(album['name'] , album['release_date'], album['total_tracks'], album['uri']))
    
    print('Total albums Number is {:3d}'.format(len(albums)))
    

    Result for Birdy

    d:\temp\361>python get-album.py
    -------------------------------------------------------------------------------------------------------------
    Birdy Album Name                         : Release Date : Total Tracks : Album URI
    -------------------------------------------------------------------------------------------------------------
    Young Heart                              : 2021-04-30   :           16 : spotify:album:4qsLVZk1UnizpQJBkbFNdx
    Beautiful Lies                           : 2016-03-25   :           14 : spotify:album:1UVggFtdVPqHy5WamYFu6w
    Beautiful Lies                           : 2016-03-25   :           14 : spotify:album:5wNnopxjgSKVvHTIcBpV8Q
    Beautiful Lies (Deluxe)                  : 2016-03-25   :           19 : spotify:album:5TxrdDAUkipqcb4zGnnwkz
    Beautiful Lies (Deluxe)                  : 2016-03-25   :           19 : spotify:album:2uNFpEVey5RsxzTdoDmjiz
    Fire Within                              : 2013-09-16   :           13 : spotify:album:0r94AFhRLvpfXvha7vx2dK
    Fire Within                              : 2013-09-16   :           11 : spotify:album:1JCe9MAwb1aE01UoAwCnOM
    Fire Within (Deluxe)                     : 2013-09-16   :           17 : spotify:album:7xvS6C6kW205wbu0fjwqyu
    Fire Within (Deluxe)                     : 2013-09-16   :           16 : spotify:album:24f6ycLAjcI8rNYfV6WZvS
    Fire Within (Deluxe)                     : 2013-09-16   :           15 : spotify:album:6ig2k0oiH2AXm8MugikskB
    Live in London                           : 2012         :            8 : spotify:album:55BQeeBdoCapsI5SZFA3IN
    Birdy                                    : 2011-11-19   :           11 : spotify:album:1RmWIBEicywaqVL5re4qbI
    Birdy                                    : 2011-11-07   :           13 : spotify:album:7j7ykLBxerILBLBc8AICJS
    Birdy                                    : 2011-11-07   :           12 : spotify:album:2dpWqqBl9Faf0Bfo8q4F5u
    Birdy                                    : 2011-11-04   :           11 : spotify:album:1WGjSVIw0TVfbp5KrOFiP0
    Birdy (Deluxe Version)                   : 2011-11-04   :           14 : spotify:album:3sGzkurA1fvEFqh73sWCVA
    Total albums Number is  16
    

    Result for Led Zeppelin (enable line 11 and comment out line 10 of the demo code)

    d:\temp\361>python get-album.py
    -------------------------------------------------------------------------------------------------------------
    Led Zeppelin Album Name                  : Release Date : Total Tracks : Album URI
    -------------------------------------------------------------------------------------------------------------
    The Complete BBC Sessions (Remastered)   : 2016-09-16   :           33 : spotify:album:6VH2op0GKIl3WNTbZmmcmI
    Physical Graffiti (Deluxe Edition)       : 2015-02-24   :           22 : spotify:album:23FJTTzUIUjhmimOE2CTX2
    Celebration Day                          : 2012-11-19   :           16 : spotify:album:0kTe1sQd9yhDsdG2Zth7X6
    Mothership (Remastered)                  : 2007         :           24 : spotify:album:4wExFfncaUIqSgoxnqa3Eh
    How the West Was Won (Remaster)          : 2003-05-27   :           18 : spotify:album:3otvl9PN3kOgk5uwAh1CBL
    Coda (Deluxe Edition)                    : 1982-11-19   :           23 : spotify:album:56G9UnPmRifbubzPDJfAyd
    Coda (Remaster)                          : 1982-11-19   :            8 : spotify:album:228mANuRrV20jS5DCA0eER
    In Through the out Door (Deluxe Edition) : 1979-08-15   :           14 : spotify:album:1jCYuXr0Ujke24z1ymBr5U
    In Through the out Door (Remaster)       : 1979-08-15   :            7 : spotify:album:1W5CtQ7Ng0kP3lXyz7PIT2
    The Song Remains the Same (Remaster)     : 1976-10-22   :           15 : spotify:album:0ui4S0TZghkf1d1Wz0oWpk
    Presence (Deluxe Edition)                : 1976-03-31   :           12 : spotify:album:6vSiY2OVanKKforfEOT2PD
    Presence (Remaster)                      : 1976-03-31   :            7 : spotify:album:3uhD8hNpb0m3iIZ18RHH5u
    Physical Graffiti (1994 Remaster)        : 1975-02-24   :           15 : spotify:album:0ovKDDAHiTwg4AEjKdgdWo
    Physical Graffiti (Remaster)             : 1975-02-24   :           15 : spotify:album:4Q7cPyiP8cMIlUEHAqeYfd
    Physical Graffiti (Deluxe Edition)       : 1975-02-24   :           22 : spotify:album:26tH0kjUhkxBEd3ipGkx3Y
    Houses of the Holy (Deluxe Edition)      : 1973-03-28   :           15 : spotify:album:7gS8ozSkvPW3VBPLnXOZ7S
    Houses of the Holy (Remaster)            : 1973-03-28   :            8 : spotify:album:0GqpoHJREPp0iuXK3HzrHk
    Led Zeppelin IV (Deluxe Edition)         : 1971-11-08   :           16 : spotify:album:44Ig8dzqOkvkGDzaUof9lK
    Led Zeppelin IV (Remaster)               : 1971-11-08   :            8 : spotify:album:5EyIDBAqhnlkAHqvPRwdbX
    Led Zeppelin III (Deluxe Edition)        : 1970-10-05   :           19 : spotify:album:3EaQYGDFE96xbOPxkJNXfX
    Led Zeppelin III (Deluxe Edition)        : 1970-10-05   :           19 : spotify:album:4xGEiQ7La4japmGrREeLlw
    Led Zeppelin III (Remaster)              : 1970         :           10 : spotify:album:6P5QHz4XtxOmS5EuiGIPut
    Led Zeppelin II (Deluxe Edition)         : 1969-10-22   :           17 : spotify:album:58N1RPC3B4mRkjBaug4u3X
    Led Zeppelin II (Deluxe Edition)         : 1969-10-22   :           17 : spotify:album:0kQ7ZEH940VZXAfJD4xh2L
    Led Zeppelin II (Remaster)               : 1969-10-22   :            9 : spotify:album:58MQ0PLijVHePUonQlK76Y
    Led Zeppelin II (1994 Remaster)          : 1969-10-22   :            9 : spotify:album:70lQYZtypdCALtFVlQAcvx
    Led Zeppelin (Deluxe Edition)            : 1969-01-12   :           17 : spotify:album:5HNlYbQp7wKbKscWy7ceMp
    Led Zeppelin (Deluxe Edition)            : 1969-01-12   :           17 : spotify:album:22BzOOZKYZ2jYYKLpOlnET
    Led Zeppelin (Remaster)                  : 1969-01-12   :            9 : spotify:album:1J8QW9qsMLx3staWaHpQmU
    Led Zeppelin                             : 1969-01-12   :            9 : spotify:album:3ycjBixZf7S3WpC5WZhhUK
    Total albums Number is  30
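
On the broader question of populating a dataset: once the albums list is collected, the fields printed above can be turned into rows and written out. The following is only a sketch under assumptions. albums_to_rows and rows_to_csv are hypothetical helper names, the sample values are copied from the Birdy output above, and the repeated entry simulates fetching the same album twice. Note that release_date can be just a year ('2012'), so it is stored as text here. Albums with the same name have different URIs in the output, so the URI is used as the key.

```python
import csv
import io

FIELDS = ['name', 'release_date', 'total_tracks', 'uri']


def albums_to_rows(albums):
    """Keep one row per album URI, with the same fields the demo prints."""
    seen = set()
    rows = []
    for album in albums:
        if album['uri'] in seen:
            continue  # skip an album that was already stored
        seen.add(album['uri'])
        rows.append({field: album[field] for field in FIELDS})
    return rows


def rows_to_csv(rows):
    """Serialise the rows as CSV text (a file or a database would work too)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Sample data taken from the Birdy result; the third entry is a simulated repeat.
sample_albums = [
    {'name': 'Young Heart', 'release_date': '2021-04-30', 'total_tracks': 16,
     'uri': 'spotify:album:4qsLVZk1UnizpQJBkbFNdx'},
    {'name': 'Live in London', 'release_date': '2012', 'total_tracks': 8,
     'uri': 'spotify:album:55BQeeBdoCapsI5SZFA3IN'},
    {'name': 'Young Heart', 'release_date': '2021-04-30', 'total_tracks': 16,
     'uri': 'spotify:album:4qsLVZk1UnizpQJBkbFNdx'},
]

rows = albums_to_rows(sample_albums)
csv_text = rows_to_csv(rows)
print(csv_text)
```

In the demo code, rows = albums_to_rows(albums) would take the place of the printing loop.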