python nlp nltk google-cloud-run textblob

Is there a way to download TextBlob corpora to Google Cloud Run?


I am using Python with TextBlob for sentiment analysis. I want to deploy my app (built in Plotly Dash) to Google Cloud Run with Google Cloud Build (without using Docker). Locally, in my virtual environment, everything works fine, but after deploying to the cloud the corpora are not downloaded. The requirements.txt file also contains no reference to these corpora.

I have tried adding python -m textblob.download_corpora to my requirements.txt file, but the corpora are not downloaded when I deploy. I have also tried adding

import textblob
import subprocess
cmd = ['python','-m','textblob.download_corpora']
subprocess.run(cmd)

and

import nltk
nltk.download('movie_reviews')

to my script (callbacks.py, I am using Plotly Dash to make my app), all without success.

Is there a way to add this corpus to my requirements.txt file? Or is there another workaround to download this corpus? How can I fix this?

Thanks in advance!

Vijay


Solution

  • Since Cloud Run creates and destroys containers as needed for your traffic levels, you'll want to embed the corpora in the pre-built container image (instead of downloading them when the container starts) to keep cold starts fast.

    requirements.txt can only list packages for pip to install, so it can't run the download command. The easiest way to do this is to add a line to a Dockerfile that downloads and installs the corpora at build time, like so:

    RUN python -m textblob.download_corpora 
    

    Here's a full Dockerfile for your reference:

    # Python image to use.
    FROM python:3.8
    
    # Set the working directory to /app
    WORKDIR /app
    
    # Copy the requirements file used for dependencies
    COPY requirements.txt .
    
    # Install any needed packages specified in requirements.txt
    RUN pip install --trusted-host pypi.python.org -r requirements.txt
    
    # Download the TextBlob corpora at build time so they are baked into the image
    RUN python -m textblob.download_corpora
    
    # Copy the rest of the working directory contents into the container at /app
    COPY . .
    
    # Run app.py when the container launches
    ENTRYPOINT ["python", "app.py"]
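
To confirm that the build step above actually baked the corpora into the image, a small startup check can help. This is only a sketch using the standard library, not part of TextBlob or NLTK: the function names are hypothetical, and it assumes the data was downloaded to `~/nltk_data` (or to a directory named by the `NLTK_DATA` environment variable) with resources stored as folders or `.zip` files under subdirectories such as `corpora/`.

```python
import os


def default_search_dirs():
    """Directories to search for NLTK data (assumed locations, adjust as needed)."""
    dirs = []
    # NLTK_DATA may hold one or more directories separated by os.pathsep.
    env_value = os.environ.get("NLTK_DATA", "")
    dirs.extend(d for d in env_value.split(os.pathsep) if d)
    # Assumed per-user download location.
    dirs.append(os.path.join(os.path.expanduser("~"), "nltk_data"))
    return dirs


def find_missing_resources(resources, search_dirs):
    """Return the resources that exist neither as a folder nor as a .zip file."""
    missing = []
    for resource in resources:
        found = any(
            os.path.exists(os.path.join(base, resource))
            or os.path.exists(os.path.join(base, resource + ".zip"))
            for base in search_dirs
        )
        if not found:
            missing.append(resource)
    return missing
```

Calling `find_missing_resources(["corpora/movie_reviews"], default_search_dirs())` at the top of app.py and logging a non-empty result would make a skipped download visible in the Cloud Run logs right away, rather than on the first sentiment request.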
    

    Good luck, Josh