I created a short Python application in a Google Colab notebook, and it works fine there. I am now trying to move the application from Google Colab into a local Python application running in Docker.
But whenever I run the application with Docker and a .py file contains the import
from llama_parse import LlamaParse
the application fails with the following error:
pdf-compare-mvp1-web-1 | File "/usr/local/lib/python3.8/site-packages/llama_parse/base.py", line 386, in LlamaParse
pdf-compare-mvp1-web-1 | def get_images(self, json_result: list[dict], download_path: str) -> List[dict]:
pdf-compare-mvp1-web-1 | TypeError: 'type' object is not subscriptable
Here is the working notebook code:
!pip install -q llama-index
!pip install -q openai
!pip install -q transformers
!pip install -q accelerate
import os
os.environ["OPENAI_API_KEY"] = "..."
from IPython.display import Markdown, display
from llama_index.llms.openai import OpenAI
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_parse import LlamaParse
parser = LlamaParse(
    api_key="...",
    result_type="markdown"
)
document1 = await parser.aload_data("data/some.pdf")
# some more code
Below is the local Docker application, which does not work:
To build and run the application locally on my machine, I created a Python Flask app with the following relevant files:
Dockerfile
# Use an official Python runtime as a parent image
FROM python:3.8-slim
# Set the working directory in the container
WORKDIR /app
# Copy the current directory contents into the container at /app
COPY . /app
# Install any needed packages specified in requirements.txt
RUN pip install --no-cache-dir -r requirements.txt
# Make port 5000 available to the world outside this container
EXPOSE 5000
# Define environment variable
ENV FLASK_APP=app.py
# Run app.py when the container launches
CMD ["flask", "run", "--host=0.0.0.0"]
docker-compose.yml
version: '3.8'
services:
  web:
    build: .
    volumes:
      - .:/app
    ports:
      - "5002:5000"
    environment:
      - FLASK_ENV=development
requirements.txt
Flask==2.1.2
python-dotenv==0.20.0
boto3==1.24.12
Werkzeug==2.2.0
PyMuPDF
llama-index
openai
llama-parse
# transformers
# accelerate
processor.py
import fitz # PyMuPDF
import os
from config import Config
# from llama_parse import LlamaParse
# import nest_asyncio
# nest_asyncio.apply()
from llama_index.llms.openai import OpenAI
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_parse import LlamaParse
The last line, from llama_parse import LlamaParse, results in the error. When this line is present, the application fails; when it is removed, the application works.
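As per this answer, subscripting built-in types in annotations, as in
list[dict]
is only supported from Python 3.9 onward (PEP 585). On Python 3.8 it raises the TypeError from your traceback, so you need to upgrade your Python version.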
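To see the difference for yourself, here is a minimal sketch that you can run on any interpreter. The helper names builtin_generics_supported and pages_with_images are made up for this illustration and are not part of llama_parse; the second function only shows that the typing.List / typing.Dict spelling works on 3.8 as well.

```python
import sys
from typing import Dict, List


def builtin_generics_supported() -> bool:
    """Return True if expressions like list[dict] can be evaluated here."""
    try:
        list[dict]  # raises TypeError on Python 3.8 and earlier
        return True
    except TypeError:
        return False


# Hypothetical function, only for illustration: the typing.List / typing.Dict
# spelling of the same annotation is accepted on Python 3.8 too.
def pages_with_images(json_result: List[Dict]) -> List[Dict]:
    return [page for page in json_result if "images" in page]


print(sys.version_info[:2], builtin_generics_supported())
```

Since the failing annotation lives inside the installed llama_parse package rather than in your own code, changing the interpreter version is the practical fix.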
In your Dockerfile, change FROM python:3.8-slim
to FROM python:3.9-slim
or any later version, then rebuild the image.
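If you want to confirm which interpreter the container actually runs, a small check like the following could go near the top of app.py. This is only a sketch: MIN_PYTHON and python_is_new_enough are names invented here, and the (3, 9) minimum is taken from the explanation above.

```python
import sys

# Assumed minimum: first release where list[dict] is valid at runtime.
MIN_PYTHON = (3, 9)


def python_is_new_enough(version=None, minimum=MIN_PYTHON) -> bool:
    """Compare a (major, minor, ...) version tuple against the minimum."""
    current = tuple(version or sys.version_info)[:2]
    return current >= tuple(minimum)


# Report the running interpreter so the container logs show it at startup.
status = "OK" if python_is_new_enough() else "too old for llama_parse"
print("Python %d.%d: %s" % (sys.version_info[0], sys.version_info[1], status))
```

If the log still shows 3.8 after editing the Dockerfile, the old image is probably still in use and needs to be rebuilt.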