I have a pandas DataFrame containing students' grade points. I want to generate a word cloud (or rather a number cloud) from the values in the CGPA column, so the cloud should contain numbers instead of words. Is there a way to achieve this? I have tried several approaches, but none of them worked.
Here is what I tried:
import pandas as pd
from wordcloud import WordCloud
import matplotlib.pyplot as plt
df = pd.read_csv("VTU_marks.csv")
# rounding off
df = df[df['CGPA'].isnull() == False]
df['CGPA'] = df['CGPA'].round(decimals=2)
wordcloud = WordCloud(max_font_size=50,max_words=100,background_color="white").generate(string)
plt.figure()
plt.imshow(wordcloud, interpolation="bilinear")
plt.axis("off")
plt.show()
But I am getting this error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-47-29ec36ebbb1e> in <module>()
----> 1 wordcloud = WordCloud(max_font_size=50, max_words=100, background_color="white").generate(string)
2 plt.figure()
3 plt.imshow(wordcloud, interpolation="bilinear")
4 plt.axis("off")
5 plt.show()
/usr/local/lib/python3.6/dist-packages/wordcloud/wordcloud.py in generate(self, text)
603 self
604 """
--> 605 return self.generate_from_text(text)
606
607 def _check_generated(self):
/usr/local/lib/python3.6/dist-packages/wordcloud/wordcloud.py in generate_from_text(self, text)
585 """
586 words = self.process_text(text)
--> 587 self.generate_from_frequencies(words)
588 return self
589
/usr/local/lib/python3.6/dist-packages/wordcloud/wordcloud.py in generate_from_frequencies(self, frequencies, max_font_size)
381 if len(frequencies) <= 0:
382 raise ValueError("We need at least 1 word to plot a word cloud, "
--> 383 "got %d." % len(frequencies))
384 frequencies = frequencies[:self.max_words]
385
ValueError: We need at least 1 word to plot a word cloud, got 0.
You can find the data here.
After setting up your data and rounding as desired, we can count how often each score occurs:
counts = df['CGPA'].value_counts()
We need to make sure that the index labels here are strings; floats will raise an error (this is what was wrong in your example attempt). We can convert them to strings as follows:
counts.index = counts.index.map(str)
# The alternative below works for pandas versions >= 0.19.0
# counts.index = counts.index.astype(str)
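To see what these two steps produce, here is a minimal sketch on a small made-up Series. The values are hypothetical and only stand in for the CGPA column of the real file:

```python
import pandas as pd

# Hypothetical grade points standing in for df['CGPA'].
cgpa = pd.Series([8.5, 7.25, 8.5, 9.0, 7.25, 8.5])

# Count how many times each distinct score occurs.
counts = cgpa.value_counts()

# The index still holds floats at this point; convert the labels to strings.
counts.index = counts.index.map(str)

print(counts)
```

After the conversion, every label in `counts.index` is a string such as `'8.5'`, and the values are the number of students with that score.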
We can then use the .generate_from_frequencies method to get what you desire:
wordcloud = WordCloud().generate_from_frequencies(counts)
plt.figure()
plt.imshow(wordcloud, interpolation="bilinear")
plt.axis("off")
plt.show()
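generate_from_frequencies is documented as taking a mapping from strings to frequencies, so the same kind of input can also be built without pandas. The following is only a sketch using the standard library; `grade_frequencies` and the sample values are hypothetical names and data introduced for illustration:

```python
from collections import Counter


def grade_frequencies(values, decimals=2):
    """Count how often each grade occurs, keyed by its formatted string label.

    Entries that are None are skipped; every other value is formatted
    with a fixed number of decimals so equal grades share one label.
    """
    labels = (f"{v:.{decimals}f}" for v in values if v is not None)
    return dict(Counter(labels))


# Hypothetical sample data standing in for the CGPA column.
sample = [8.5, 7.25, 8.512, None, 8.5, 7.25, 8.5]
freqs = grade_frequencies(sample)
print(freqs)
```

The resulting dictionary has string keys and integer counts, so it could be passed to WordCloud().generate_from_frequencies(freqs) in place of the pandas Series.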
This gave me the following:
Full MWE:
import pandas as pd
from wordcloud import WordCloud
import matplotlib.pyplot as plt
df = pd.read_csv("VTU_marks.csv")
# drop rows with a missing CGPA, then round to two decimals
df = df[df['CGPA'].notnull()]
df['CGPA'] = df['CGPA'].round(decimals=2)
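# count how often each score occurs
counts = df['CGPA'].value_counts()
# the index labels must be strings, not floats
counts.index = counts.index.map(str)
# alternative for pandas versions >= 0.19.0:
# counts.index = counts.index.astype(str)
# build the cloud directly from the score frequencies
wordcloud = WordCloud().generate_from_frequencies(counts)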
plt.figure()
plt.imshow(wordcloud, interpolation="bilinear")
plt.axis("off")
plt.show()