I am using Python with TextBlob for sentiment analysis. I want to deploy my app (build in Plotly Dash) to Google Cloud Run with Google Cloud Build (without using Docker). When using locally on my virtual environment all goes fine, but after deploying it on the cloud the corpora is not downloaded. Looking at the requriements.txt file, there was also no reference to this corpora.
I have tried to add python -m textblob.download_corpora
to my requriements.txt file but it doesn't download when I deploy it. I have also tried to add
import textblob
import subprocess
cmd = ['python','-m','textblob.download_corpora']
subprocess.run(cmd)
and
import nltk
nltk.download('movie_reviews')
to my script (callbacks.py, I am using Plotly Dash to make my app), all without success.
Is there a way to add this corpus to my requirements.txt file? Or is there another workaround to download this corpus? How can I fix this?
Thanks in advance!
Vijay
Since Cloud Run creates and destroys containers as needed for your traffic levels you'll want to embed your corpora in the pre-built container to ensure a fast cold start time (instead of downloading it when the container starts)
The easiest way to do this is add another line inside of a docker file that downloads and installs the corpora at build time like so:
RUN python -m textblob.download_corpora
Here's a full docker file for your reference:
# Python image to use.
FROM python:3.8
# Set the working directory to /app
WORKDIR /app
# copy the requirements file used for dependencies
COPY requirements.txt .
# Install any needed packages specified in requirements.txt
RUN pip install --trusted-host pypi.python.org -r requirements.txt
RUN python -m textblob.download_corpora
# Copy the rest of the working directory contents into the container at /app
COPY . .
# Run app.py when the container launches
ENTRYPOINT ["python", "app.py"]
Good luck, Josh