python, colors, colormap, word-cloud

Mapping wordcloud color to a value for sentiment analysis


I'm looking for a way to map the color of each word in a word cloud to a value, or maybe even to overlay two word clouds (one built from a positive word list and one from a negative list), so that the end result uses a dark color for negative sentiment and a bright color for positive sentiment. The picture shows the kind of look I'm after, except that there the shades are assigned randomly.

I'm not sure how I would assign the value, because from what I can see you either pass in raw text or text with a frequency value. Maybe the answer is the latter, with two clouds overlapped?

I was able to change the colors to shades of green by copying code from StackOverflow and adjusting the values. As far as I can tell, the lightness is just assigned at random. The code is below:

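import numpy as np

def green_color_func(word, font_size, position, orientation, random_state=None, **kwargs):
    # Fixed hue (100) and saturation (100%); lightness is drawn at random from 1-50%
    return "hsl(100, 100%%, %d%%)" % np.random.randint(1, 51)

wordCloud.recolor(color_func=green_color_func)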

Below is a sample of my code. For clarity I removed the stopwords, font path, mask, etc.; b3 is a dictionary mapping hashtags to their frequency counts. Maybe this can be done somehow with a color_func or recolor(self[, random_state, color_func, …])? This is the cloud I have right now:

wordCloud = WordCloud(font_path = font_path,width=1000, height=800,max_words=100,
                      random_state=21, background_color = 'white',
                      prefer_horizontal=1).generate_from_frequencies(b3)




plt.figure(figsize=(30,15))
plt.imshow(wordCloud, interpolation = 'bilinear')
plt.axis('off')
plt.show() 

wordCloud.to_file("bt3.png")

I'm fairly new to coding and I'm stumped by this, so I appreciate any insight. Thanks.


Solution

  • I know this was asked a while back and may no longer be relevant.
    However, if someone is looking to achieve the same thing, this is one way to do it.
    Assuming you have a list of words with sentiment scores in a dataframe like this:

    import numpy as np
    import pandas as pd
    
    # Create a dataframe with the sentimental word list
    df = pd.read_csv('https://raw.githubusercontent.com/text-machine-lab/sentimental/master/sentimental/word_list/afinn.csv')
    
    # Generate a random frequency for each word; seed the generator for reproducibility
    np.random.seed(42)
    df['freq'] = np.random.randint(1, 100, df.shape[0])
    
    print(df)
    

    OUTPUT:

               word  score  freq
    0       abandon     -2    52
    1     abandoned     -2    93
    2      abandons     -2    15
    3      abducted     -2    72
    4     abduction     -2    61
    ...         ...    ...   ...
    2457      yucky     -2    77
    2458      yummy      3    57
    2459     zealot     -2    25
    2460    zealots     -2     2
    2461    zealous      2    64
    

    You could create a word cloud with colours mapped to their sentiments using the following code:

    from wordcloud import WordCloud
    import matplotlib.pyplot as plt
    import pandas as pd
    import numpy as np
    
    class SimpleGroupedColorFunc(object):
        def __init__(self, color_to_words, default_color):
            self.word_to_color = {word: color
                                  for (color, words) in color_to_words.items()
                                  for word in words}
    
            self.default_color = default_color
    
        def __call__(self, word, **kwargs):
            return self.word_to_color.get(word, self.default_color)
    
    
    
    # Create a dataframe with the sentimental word list
    df = pd.read_csv('https://raw.githubusercontent.com/text-machine-lab/sentimental/master/sentimental/word_list/afinn.csv')
    
    # Generate a random frequency for each word; seed the generator for reproducibility
    np.random.seed(42)
    df['freq'] = np.random.randint(1, 100, df.shape[0])
    
    # Generate a word-frequency dictionary to import to the word cloud
    text_freq_dict = dict(zip(df.word, df.freq))
    
    # Generate word cloud from word-frequency dictionary
    wordcloud = WordCloud()
    wordcloud.generate_from_frequencies(frequencies=text_freq_dict)
    
    pos_words = []
    neg_words = []
    
    # Categorize the words based on sentiment score, assign to appropriate list 
    for word, sentiment in zip(df['word'], df['score']):
        if sentiment > 1:
            pos_words.append(word)
        elif sentiment < -1:
            neg_words.append(word)
    
    
    # Create a dictionary containing the colours to be used for each sentiment list
    color_to_words = {
        # green colour for positive words
        '#00ff00': pos_words,
        # red for negative words
        'red': neg_words
    }
    
    # Neutral words will be grey (with sentiment between -1 and 1 inclusive)
    default_color = 'grey'
    
    # Create a color function with simple grouped tones
    grouped_color_func = SimpleGroupedColorFunc(color_to_words, default_color)
    
    # Apply our color function to the word cloud
    wordcloud.recolor(color_func=grouped_color_func)
    
    # Plot the word cloud
    plt.figure()
    plt.imshow(wordcloud, interpolation="bilinear")
    plt.axis("off")
    plt.show()
    

    OUTPUT:

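    [Image: the resulting word cloud, with positive words in green, negative words in red and neutral words in grey]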

    Please note that some parts of the code above were borrowed from the wordcloud package site.
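
The grouped approach above gives each sentiment class one flat colour. The question also asks for a shade that varies with the value (dark for negative, bright for positive), and a minimal sketch of that idea follows. It is not part of the original answer: the class name `SentimentLightnessColorFunc` and the parameters `hue`, `min_light`, `max_light` and `default_color` are hypothetical names chosen for illustration. It only assumes that the colour function may return an `hsl(...)` string, as `green_color_func` in the question does, and that it is called with the word as the first argument, as `SimpleGroupedColorFunc` is.

```python
class SentimentLightnessColorFunc:
    """Map a word's sentiment score to the lightness of one fixed hue.

    Hypothetical helper for illustration; not part of the wordcloud package.
    """

    def __init__(self, word_to_score, hue=100, min_light=15, max_light=60,
                 default_color="grey"):
        self.word_to_score = dict(word_to_score)
        self.hue = hue
        self.min_light = min_light      # lightness (%) for the lowest score
        self.max_light = max_light      # lightness (%) for the highest score
        self.default_color = default_color
        scores = list(self.word_to_score.values())
        self.lo = min(scores, default=0)
        self.hi = max(scores, default=0)

    def __call__(self, word, **kwargs):
        score = self.word_to_score.get(word)
        if score is None:
            # Words without a known score fall back to the default colour
            return self.default_color
        if self.hi == self.lo:
            fraction = 0.5              # all scores equal: use the midpoint
        else:
            # Linear rescaling of the score to the range 0..1
            fraction = (score - self.lo) / (self.hi - self.lo)
        span = self.max_light - self.min_light
        lightness = round(self.min_light + fraction * span)
        return "hsl(%d, 100%%, %d%%)" % (self.hue, lightness)


# Tiny example using three rows from the dataframe shown earlier
scores = {"abandon": -2, "yummy": 3, "zealous": 2}
color_func = SentimentLightnessColorFunc(scores)
print(color_func("abandon"))   # darkest shade
print(color_func("yummy"))     # brightest shade
```

With the dataframe from the answer, passing `SentimentLightnessColorFunc(dict(zip(df.word, df.score)))` as `color_func` to `wordcloud.recolor` should colour every word by its score instead of by group. The lightness is capped at 60% by default so that the brightest words stay readable on a white background; adjust the bounds to taste.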