
How to pick a random song from Genius and send a small section of its lyrics to Discord


@client.command()
async def lyrics(ctx):
    genius = lyricsgenius.Genius(api_key)
    message_ID = ctx.message.author.id
    await ctx.send("**name the song author:**")
    @client.listen("on_message")
    async def on_message(message):
        if message.author.id == int(message_ID):
            artist_name = message.content
            artist = genius.search_artist(artist_name)
            print(artist)
        else:
            return

This is what I have so far. It asks the user for an artist's name and searches through the list of songs that the artist has on Genius. However, this takes a lot of time. I want it to pick a random song, get its lyrics, and send a small section of them.


Solution

  • I have decided to move my comments to an answer. I installed the package and played with it a little. The problem you are facing is the implementation of search_artist: it fetches full song data, which is not what we want. search_songs is a more appropriate choice; another alternative is reimplementing search_artist so that it returns only one result. I will focus on the lyrics-fetching logic only, without wrapping it in the Discord API.

    The starting point of my investigation was the public Genius API. You cannot circumvent it: the library you are using is just a wrapper around it (see e.g. this definition). There is no ready-made endpoint for fetching a random song by a given artist, so your task requires calling at least two API endpoints. More specifically...

    First, you do GET /search with the artist query. You cannot omit this step: at the beginning you have an artist name, but not the artist ID. The call returns a JSON result similar to the one below (I am using the Try it! examples from the Genius API page linked above).

    hits: [ { highlights: [...],
    index: "song",
    type: "song",
    result: {
    annotation_count: 20,
    api_path: "/songs/3039923", (...)
    

    An alternative is calling search_artists (note the s at the end!). It can get fairly ugly, as the relevant information is nested deeply in the JSON. For example,

    genius.search_artists("Led")['sections'][0]['hits'][0]['result']['id']
    

    returns 562, which is the artist ID for Led Zeppelin. This line looks bad, but the lyricsgenius implementation is fairly similar, albeit with more precautions. At a bare minimum, it would make sense to wrap the line above in try/except.
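    As a minimal sketch of that guarded lookup: the helper name first_artist_id and the hand-written sample response are assumptions for illustration, and only the nesting shown in the one-liner above is taken from the answer.

```python
from typing import Any, Optional


def first_artist_id(response: Any) -> Optional[int]:
    """Return the ID of the first artist hit, or None if the shape is unexpected."""
    try:
        return response['sections'][0]['hits'][0]['result']['id']
    except (KeyError, IndexError, TypeError):
        # Missing key, empty hit list, or a non-dict response: treat as "not found".
        return None


# Hand-written stand-in for a search_artists response, trimmed to the fields used above.
sample = {'sections': [{'hits': [{'result': {'id': 562, 'name': 'Led Zeppelin'}}]}]}
print(first_artist_id(sample))                        # 562
print(first_artist_id({'sections': [{'hits': []}]}))  # None
```

    In the bot you would pass it the result of genius.search_artists(artist_name) and tell the user when it returns None.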

    If picking randomly from all songs by the top artist returned by the search is not a requirement, a more straightforward way would be to just use a bare, non-artist-specific search:

    songs = genius.search_songs("Led")['hits']
    

    This would lead to some entries by other artists (e.g. Lo & Leduc, in the above case).

    At this point, we have the artist ID and other info, but no songs yet. And this is where it gets ugly. The Genius API does not return all the song IDs in one query; it uses pagination instead. That is sensible from a design perspective, but nightmarish to work around in your use case. The call above returns 10 entries by default; with per_page you can get up to 50 per API call, and by providing a page offset you can browse individual songs. Thus, you do not know in advance how many songs there will be. Another part of the problem seems to be that Genius is SSR-heavy and takes a long time when lots of songs are requested at once. This is fine for data-science-y projects, but not so much for a Discord bot.
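    To make the pagination concrete, here is a rough sketch of walking pages under a request budget. iter_songs, fake_fetch and max_requests are hypothetical names; the fake fetcher only imitates the 'songs' and 'next_page' fields that artist_songs responses carry, so nothing here touches the real API.

```python
from typing import Callable, Dict, Iterator, Optional


def iter_songs(fetch_page: Callable[[int], Dict], max_requests: int = 5) -> Iterator[Dict]:
    """Yield songs page by page until 'next_page' is falsy or the budget runs out."""
    page: Optional[int] = 1
    requests_made = 0
    while page and requests_made < max_requests:
        response = fetch_page(page)
        requests_made += 1
        yield from response['songs']
        page = response.get('next_page')


# Hypothetical stand-in for genius.artist_songs(artist_id, per_page=2, page=page).
FAKE_CATALOGUE = [{'id': i} for i in range(1, 6)]  # five songs


def fake_fetch(page: int, per_page: int = 2) -> Dict:
    start = (page - 1) * per_page
    chunk = FAKE_CATALOGUE[start:start + per_page]
    has_more = start + per_page < len(FAKE_CATALOGUE)
    return {'songs': chunk, 'next_page': page + 1 if has_more else None}


print([song['id'] for song in iter_songs(fake_fetch)])  # [1, 2, 3, 4, 5]
```

    Every page is one HTTP round trip, which is why fetching a whole catalogue this way is too slow for a chat command.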

    I might be blind, but I could not find a function or field in the API for the total number of songs per artist. Likely, the backend uses SELECT ... FROM ... LIMIT offset, count or some such internally. Interestingly, I also found that binary searching for the number of songs by a given artist can be as fast as, or faster than, requesting a large number of songs at once. Compare

    genius.artist_songs(562, per_page=1, page=250)
    

    vs

    genius.artist_songs(562, per_page=50, page=5)
    

    Responses returned by the calls above have the ['next_page'] field set when a next page exists, i.e. when you are not yet at the end of the song list. Again, this is not something you can circumvent. An interesting problem would be tuning this search for the best per_page counts and, possibly, skipping the last few requests once the search step becomes too small, but an okay starting point would be something like this:

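    max_page = 1000                 # upper bound of the search
    page = max_page
    last_known_good_page = 0        # highest page known to exist
    last_known_bad_page = max_page  # lowest page known to have no successor
    while True:
        # With per_page=1, one page is one song; 'next_page' is falsy at the end of the list.
        next_page = genius.artist_songs(artist_id, per_page=1, page=page)['next_page']
        if next_page:
            # page + 1 exists: raise the lower bound and probe higher.
            last_known_good_page = next_page
            page = (last_known_bad_page + page) // 2
        else:
            # No successor: lower the upper bound and probe lower.
            last_known_bad_page = page
            page = (last_known_good_page + page) // 2
        if page - last_known_good_page < 1:
            break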
    

    Of course, be careful when integrating such open-ended loops in your app. For example, if the artist has more than max_page entries, the result is capped at max_page and is silently wrong. For artist_id = 562 it gives the correct answer of 275 pages. The code above runs about as fast as using per_page = 50 and max_page = 20. You could toy with these parameters a bit more, but the logic gets a touch more convoluted if you rely on multiple results per page, as you would then need to count them.
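    For comparison, here is a sketch of the same idea written as a textbook binary search, which makes at most about log2(max_page) requests. It is not the code timed in this answer: count_pages and has_next are hypothetical names, and it assumes the artist has at least one song and that 'next_page' is falsy on the last page and beyond.

```python
from typing import Callable


def count_pages(has_next: Callable[[int], bool], max_page: int = 1000) -> int:
    """Return the smallest page in 1..max_page that has no successor (capped at max_page)."""
    low, high = 1, max_page
    while low < high:
        mid = (low + high) // 2
        if has_next(mid):
            low = mid + 1  # the last page lies strictly after mid
        else:
            high = mid     # mid is the last page or lies beyond it
    return low


# Fake backend: an artist with 275 songs, one song per page.
# Against the real API this could be (assumption, untested here):
#   lambda p: bool(genius.artist_songs(artist_id, per_page=1, page=p)['next_page'])
print(count_pages(lambda page: page < 275))  # 275
```

    If the artist has more than max_page songs, the function returns max_page, so the caller can detect the cap and either raise it or accept the truncated range.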

    After you have determined a number of songs, the rest is straightforward:

    import random
    # Genius pages are 1-indexed; randint includes both endpoints.
    random_id = genius.artist_songs(artist_id, per_page=1, page=random.randint(1, page), sort='popularity')['songs'][0]['id']
    lyrics = genius.lyrics(random_id)
    

    For random_id = 10, determined by a Fair Dice Roll™, you get this nice output:

    "When the Levee Breaks Lyrics[Intro]\n\n[Verse 1]\nIf it keeps on raining, levee's going to break\nIf it keeps on raining, the levee's going to break\n\n[Chorus 1]\nWhen the levee breaks, have no place to stay\n\n[Verse 2]\nMean old levee taught me to weep and moan, oh\nMean old levee taught me to weep and moan\n[Chorus 2]\nIt's got what it takes to make a mountain man leave his home\nOh well, oh well, oh well, ooh\n\n[Bridge 1]\nOh, don't it make you feel bad\nWhen you're trying to find your way home\nYou don't know which way to go\nIf you're going down south, they got no work to do\nIf you're going north to Chicago\nAh, ah, ah, hey\n\n[Instrumental Break]\n\n[Verse 3]\nCrying won't help you, praying won't do you no good\nNo, crying won't help you, praying won't do you no good\n\n[Chorus 3]\nWhen the levee breaks, mama, you got to move, ooh\n\n[Verse 4]\nAll last night I sat on the levee and moaned\nAll last night, sat on the levee and moaned\nThinking about my baby and my happy home\nOh-ho\nYou might also like[Bridge 2]\nAh, ah, ah, ah-ah\nAh, ah, ah, ah-ah\nOh, oh\n\n[Outro]\nGoing\nI'm going to Chicago\nGoing to Chicago\nSorry, but I can't take you, ah\nGoing down, going down now\nGoing down, I'm going down now\nGoing down, going down\nGoing down, going down\nOh...\nGoing down, going down now\nGoing down, going down now\nGoing down, going down now\nGoing down, going, dow- dow- dow- dow- down, now\nOoh, ooh34Embed"

    A word of warning: the API is not well-suited for your problem AT ALL. I believe there is no way to do what you want really fast. Limiting yourself to one of the top (20? maybe 50?) results and, optionally, skipping the extra query for the artist ID is probably the way to go for your use case. Otherwise, users will have to wait, and you may want to make the wait a bit more fun. I also hit request timeouts a couple of times while testing, which is something to keep in mind so that your bot does not outright break.

    A full run of the above code (I have limited it so as not to hammer the API too much):

    7.91 s ± 848 ms per loop (mean ± std. dev. of 3 runs, 1 loop each)
    

    At least, this has produced lyrics for a truly random song for a top artist returned by the search (well, as long as they have fewer than max_page songs...).
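    Regarding the waits and timeouts mentioned above: a rough sketch of keeping the bot responsive is to run the blocking lookup in a worker thread with a deadline. This assumes the lyricsgenius calls are synchronous and needs Python 3.9+ for asyncio.to_thread; call_with_timeout and slow_lookup are hypothetical names.

```python
import asyncio
import time
from typing import Callable, Optional, TypeVar

T = TypeVar('T')


async def call_with_timeout(blocking_call: Callable[[], T], timeout: float) -> Optional[T]:
    """Run a blocking call in a worker thread; return None if it misses the deadline."""
    try:
        return await asyncio.wait_for(asyncio.to_thread(blocking_call), timeout)
    except asyncio.TimeoutError:
        # The worker thread keeps running in the background; we only stop waiting for it.
        return None


def slow_lookup() -> str:
    time.sleep(0.3)  # stand-in for a slow genius.lyrics(...) call
    return 'lyrics'


async def demo():
    fast = await call_with_timeout(lambda: 'lyrics', timeout=1.0)
    slow = await call_with_timeout(slow_lookup, timeout=0.05)
    return fast, slow


print(asyncio.run(demo()))  # ('lyrics', None)
```

    Inside the command you would await call_with_timeout(lambda: genius.lyrics(random_id), timeout=...) and send an apology message when it returns None.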

    If one of the top 50 songs is good enough, it becomes a lot easier, with far fewer requests and about twice as fast:

    artist_id = genius.search_artists("Led")['sections'][0]['hits'][0]['result']['id']
    songs = genius.artist_songs(artist_id, per_page=50, sort='popularity')['songs']
    lyrics = genius.lyrics(random.choice(songs)['id'])
    
    3.57 s ± 362 ms per loop (mean ± std. dev. of 3 runs, 1 loop each). 
    

    try/except blocks would further slow you down (easily over half a second in my testing). Bear in mind that a random song might not have lyrics at all.
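    The question also asks for only a small section of the lyrics. One possible sketch, assuming the lyrics string looks like the output shown earlier (sections separated by blank lines, with bracketed headers such as [Verse 1]): random_section is a hypothetical helper, and dropping every line that contains a bracket is a crude heuristic, not something the library provides.

```python
import random
import re
from typing import Optional

DISCORD_MESSAGE_LIMIT = 2000  # Discord's character limit for a plain message


def random_section(lyrics: Optional[str], rng: Optional[random.Random] = None) -> str:
    """Pick one blank-line-separated section; return '' when there is nothing usable."""
    rng = rng or random.Random()
    sections = []
    for block in re.split(r'\n\s*\n', lyrics or ''):
        # Drop empty lines and lines carrying bracketed markers like "[Verse 1]".
        lines = [line for line in block.split('\n') if line.strip() and '[' not in line]
        if lines:
            sections.append('\n'.join(lines))
    if not sections:
        return ''
    return rng.choice(sections)[:DISCORD_MESSAGE_LIMIT]


# Placeholder text in the same layout as the lyrics output above.
sample = "Some Song Lyrics[Intro]\n\n[Verse 1]\nfirst line\nsecond line\n\n[Chorus]\nthird line"
print(random_section(sample, random.Random(0)))
print(repr(random_section(None)))  # ''
```

    An empty result covers the no-lyrics case, so the command can check for it before calling ctx.send.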

    The original implementation, which loops through all the songs and fetches them, is still way, way slower. Even for just 10 songs:

    genius.search_artist("Led", max_songs=10)
    [SNIP]
    Done. Found 10 songs.
    22.8 s ± 987 ms per loop (mean ± std. dev. of 3 runs, 1 loop each)
    

    Further improvements would likely include local caching, as rabbibillclinton suggests, or getting smart about finding the number of songs per artist. But this is the gist of it: it will not run fast, but it can run a-okay.
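    As a small illustration of the caching idea, the artist ID lookup could be memoized in memory so that repeated requests for the same artist skip one API call. fetch_artist_id is a hypothetical stand-in for the search_artists lookup, hard-coded here so the sketch runs offline; the same pattern would apply to the page count.

```python
from functools import lru_cache

CALLS = {'count': 0}


def fetch_artist_id(name: str) -> int:
    """Hypothetical stand-in for the real lookup; counts how often it runs."""
    CALLS['count'] += 1
    return 562


@lru_cache(maxsize=256)
def cached_artist_id(normalized_name: str) -> int:
    return fetch_artist_id(normalized_name)


def artist_id_for(name: str) -> int:
    # Normalize so "Led", " led " and "LED" share one cache entry.
    return cached_artist_id(name.strip().lower())


artist_id_for('Led')
artist_id_for(' led ')
print(CALLS['count'])  # 1
```

    The cache lives only as long as the bot process; persisting it to disk would be a further step.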