python pandas matplotlib plot word-cloud

'ImageDraw' object has no attribute 'textbbox'

I am working on a simple text mining project. When I tried to create a word-cloud I got this error:

AttributeError: 'ImageDraw' object has no attribute 'textbbox'

I have a dataset of news items and their categories. Before creating the word cloud, I preprocessed the text:


import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from textblob import Word
from wordcloud import WordCloud 

newsData = pd.read_csv("data.txt", sep= '\t', header=None, 
                       names=["Description", "Category", "Tags"],on_bad_lines='skip', 
                       engine='python' , encoding='utf-8')
#print(newsData.head())

newsData['Description'] =  newsData['Description'].apply(lambda x:  " ".join(x.lower() for x in x.split()))
newsData['Category'] =  newsData['Category'].apply(lambda x:  " ".join(x.lower() for x in x.split()))
newsData['Tags'] =  newsData['Tags'].apply(lambda x:  " ".join(x.lower() for x in x.split()))

# stopword filtering
stop = stopwords.words('english')
newsData['Description'] =  newsData['Description'].apply(lambda x: " ".join (x for x in x.split() if x not in stop))
#stemming

st = PorterStemmer()
newsData['Description'] =  newsData['Description'].apply(lambda x: " ".join ([st.stem(word) for word in x.split()]))
newsData['Category'] =  newsData['Category'].apply(lambda x: " ".join ([st.stem(word) for word in x.split()]))
newsData['Tags'] =  newsData['Tags'].apply(lambda x: " ".join ([st.stem(word) for word in x.split()]))

#lemmatize

newsData['Description'] =  newsData['Description'].apply(lambda x: " ".join ([Word(word).lemmatize() for word in x.split()]))
newsData['Category'] =  newsData['Category'].apply(lambda x: " ".join ([Word(word).lemmatize() for word in x.split()]))
newsData['Tags'] =  newsData['Tags'].apply(lambda x: " ".join ([Word(word).lemmatize() for word in x.split()]))
#print(newsData.head())


culture = newsData[newsData['Category'] == 'culture'].sample(n=200)
health = newsData[newsData['Category'] == 'health'].sample(n=200)
dataSample = pd.concat([culture, health],axis=0)

culturesmpl = culture[culture['Category'] == 'culture'].sample(n=200)
healthspml = health[health['Category'] == 'health'].sample(n=200)
#print(dataSample.head())

# join rows with a space so words at row boundaries do not run together
cultureSTR = culturesmpl.Description.str.cat(sep=' ')
healthSTR = healthspml.Description.str.cat(sep=' ')
#print(cultureSTR)

Then I tried to create the word cloud using the WordCloud library:

wordcloud_culture =  WordCloud(collocations= False, background_color='white' ).generate(cultureSTR)

# Plot
plt.imshow(wordcloud_culture, interpolation='bilinear')
plt.axis('off')
plt.show()

But after running this code I got the error:

  File ~/anaconda3/lib/python3.9/site-packages/wordcloud/wordcloud.py:508 in generate_from_frequencies
    box_size = draw.textbbox((0, 0), word, font=transposed_font, anchor="lt")

AttributeError: 'ImageDraw' object has no attribute 'textbbox'

Do you know how I can fix this?

Solution

History

The ImageDraw.textsize() method was deprecated in Pillow (the maintained fork of PIL) version 9.2.0 and removed completely in version 10.0.0, released on 2023-07-01.

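The ImageDraw.textbbox() method was introduced in version 8.0.0 as a more robust solution.

In the question, the traceback shows wordcloud itself calling draw.textbbox(), so the AttributeError indicates that the installed Pillow is older than 8.0.0. Upgrading Pillow (for example, pip install --upgrade Pillow) should resolve it.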
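To see which side of these version boundaries an environment falls on, a minimal sketch like the following may help. The function names supports_textbbox and installed_pillow_version are hypothetical helpers made up for this illustration; only importlib.metadata from the standard library (Python 3.8+) is used, and the check simply compares the major version against 8.

```python
from importlib.metadata import PackageNotFoundError, version


def supports_textbbox(pillow_version: str) -> bool:
    """Return True if the given Pillow version string is 8.0.0 or newer."""
    # textbbox() arrived in 8.0.0, so comparing the major number is enough.
    major = int(pillow_version.split(".")[0])
    return major >= 8


def installed_pillow_version():
    """Return the installed Pillow version string, or None if it is missing."""
    try:
        return version("Pillow")
    except PackageNotFoundError:
        return None


found = installed_pillow_version()
if found is None:
    print("Pillow is not installed in this environment")
elif supports_textbbox(found):
    print(f"Pillow {found} provides ImageDraw.textbbox()")
else:
    print(f"Pillow {found} predates textbbox(); try: pip install --upgrade Pillow")
```

Run it with the same interpreter that raises the error (here, the anaconda3 Python 3.9), since another environment may have a different Pillow installed.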

Example

If you are looking to simply replace one line of code, and you previously had

text_width, text_height = ImageDraw.Draw(image).textsize(your_text, font=your_font)

...then you could instead use

_, _, text_width, text_height = ImageDraw.Draw(image).textbbox((0, 0), your_text, font=your_font)

Explanation

textsize() outputs dimensions for the nominal width and height of the text as a tuple: (width, height). textbbox() outputs the x and y extents of the bounding box as a tuple: (left, top, right, bottom).

Starting the line with _, _, is a way to discard the first two elements of the output tuple.

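Passing (0, 0) as the first argument in textbbox() places the text's anchor point at the origin, so the right and bottom values can stand in for the width and height. Because left and top are not guaranteed to be exactly zero, right - left and bottom - top give the exact size of the bounding box.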
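If the same code has to run on both older and newer Pillow versions, the two return shapes described above can be hidden behind a small helper. This is only a sketch: text_size is a hypothetical name, and it assumes that draw is an ImageDraw.Draw instance (or anything exposing the same methods). It prefers textbbox() when the method exists and falls back to textsize() otherwise.

```python
def text_size(draw, text, font=None):
    """Return (width, height) of `text` using whichever method `draw` offers."""
    if hasattr(draw, "textbbox"):
        # Pillow >= 8.0.0: textbbox() returns (left, top, right, bottom).
        left, top, right, bottom = draw.textbbox((0, 0), text, font=font)
        return right - left, bottom - top
    # Older Pillow: textsize() already returns (width, height).
    return draw.textsize(text, font=font)
```

With the names from the example above, the call would be text_size(ImageDraw.Draw(image), your_text, font=your_font). Note that the two branches can differ by a few pixels, because textsize() reports a nominal size while textbbox() reports the actual bounding box.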

In general, keep libraries such as Pillow up to date, and consult the Pillow release notes to see why textsize() was replaced and why textbbox() is the more robust method.