python docker llama-index

Can't import LlamaParse


I created a short Python application in a Google Colab notebook, and it works fine there. I am now trying to move it from Google Colab into a local Python application running in Docker.

But the Docker version breaks as soon as a .py file contains this import:

from llama_parse import LlamaParse

My application fails with the following error:

pdf-compare-mvp1-web-1  |   File "/usr/local/lib/python3.8/site-packages/llama_parse/base.py", line 386, in LlamaParse
pdf-compare-mvp1-web-1  |     def get_images(self, json_result: list[dict], download_path: str) -> List[dict]:
pdf-compare-mvp1-web-1  | TypeError: 'type' object is not subscriptable

Here is the working notebook code:

!pip install -q llama-index
!pip install -q openai
!pip install -q transformers
!pip install -q accelerate

import os
os.environ["OPENAI_API_KEY"] = "..."

from IPython.display import Markdown, display
from llama_index.llms.openai import OpenAI
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_parse import LlamaParse

parser = LlamaParse(
    api_key="...",  
    result_type="markdown"  
)
document1 = await parser.aload_data("data/some.pdf")

# some more code

Below is the local Docker application, which does not work.

To build and run the application locally on my machine, I created a Python Flask app with the following relevant files:

Dockerfile

# Use an official Python runtime as a parent image
FROM python:3.8-slim

# Set the working directory in the container
WORKDIR /app

# Copy the current directory contents into the container at /app
COPY . /app

# Install any needed packages specified in requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

# Make port 5000 available to the world outside this container
EXPOSE 5000

# Define environment variable
ENV FLASK_APP=app.py

# Run app.py when the container launches
CMD ["flask", "run", "--host=0.0.0.0"]

docker-compose.yml

version: '3.8'
services:
  web:
    build: .
    volumes:
      - .:/app
    ports:
      - "5002:5000"
    environment:
      - FLASK_ENV=development

requirements.txt

Flask==2.1.2
python-dotenv==0.20.0
boto3==1.24.12
Werkzeug==2.2.0
PyMuPDF
llama-index
openai
llama-parse
# transformers
# accelerate

processor.py

import fitz  # PyMuPDF
import os
from config import Config

# from llama_parse import LlamaParse
# import nest_asyncio
# nest_asyncio.apply()


from llama_index.llms.openai import OpenAI
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_parse import LlamaParse

The last line, from llama_parse import LlamaParse, causes the error. When this line is present, the application fails; when I remove it, the application works.


Solution

  • As per this answer,

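    Subscripting built-in types in annotations, as in list[dict], is only supported from Python 3.9 onward (PEP 585). On Python 3.8 the annotation is evaluated when llama_parse is imported and raises TypeError: 'type' object is not subscriptable. You need to upgrade your Python version.

    You can change FROM python:3.8-slim to FROM python:3.9-slim (or any higher version) in your Dockerfile, then rebuild the image (for example with docker-compose up --build) so the new base image is actually used.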
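
To see the version dependence in isolation, here is a minimal sketch that is independent of llama_parse. The helper names supports_builtin_generics and get_names are made up for illustration; the point is only that list[dict] fails on Python 3.8 while typing.List and typing.Dict work on both old and new interpreters.

```python
import sys
from typing import Dict, List


def supports_builtin_generics() -> bool:
    """Return True if built-in types can be subscripted, e.g. list[dict]."""
    try:
        # On Python 3.8 and earlier this raises:
        # TypeError: 'type' object is not subscriptable
        list[dict]
        return True
    except TypeError:
        return False


# Hypothetical helper: typing.List / typing.Dict annotations are accepted
# on Python 3.8 as well as on 3.9 and later.
def get_names(rows: List[Dict[str, str]]) -> List[str]:
    return [row["name"] for row in rows]


if __name__ == "__main__":
    print("Python version:", sys.version_info[:2])
    print("list[dict] supported:", supports_builtin_generics())
```

Running this inside the container (for example via docker-compose run web python, assuming the service name web from the compose file above) should report whether the interpreter in the image is new enough for the installed llama-parse release.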