I have a Python project running in a Docker container, but I can't get convert_from_path (from the pdf2image library) to work.
It works locally on my Windows PC, but not in the Linux-based Docker container.
The error I get each time is: Unable to get page count. Is poppler installed and in PATH?
Relevant parts of my code look like this:
from pdf2image import convert_from_path
import os
from sys import exit

def my_function(file_source_path):
    try:
        pages = convert_from_path(file_source_path, 600, poppler_path=os.environ.get('POPPLER_PATH'))
    except Exception as e:
        print('Fail 1')
        print(e)
        try:
            pages = convert_from_path(file_source_path, 600)
        except Exception as e:
            print('Fail 2')
            print(e)
            try:
                pages = convert_from_path(file_source_path, 600, poppler_path=r'\usr\local\bin')
            except Exception as e:
                print('Fail 3')
                print(e)
                print(os.environ)
                exit('Exiting script')
In attempt 1 I try to reference the poppler binaries installed on Windows. The path refers to '/code/poppler', which is a bind mount defined as follows
[snippet from docker-compose.yml]
- type: bind
  source: "C:/Program Files/poppler-0.68.0/bin"
  target: /code/poppler
In attempt 2 I just leave the path empty. In attempt 3 I tried something that I found had worked locally for some other users.
Relevant parts of my Dockerfile look like this:
FROM python:3.10
WORKDIR /code
# install poppler
RUN apt-get update
RUN apt-get install poppler-utils -y
COPY ./requirements.txt ./
RUN pip install --upgrade pip
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "./app.py"]
The issue was that my Docker image was not being rebuilt correctly. After clearing the build cache (for example with docker compose build --no-cache) and trying again, the middle option (attempt 2) worked together with the Dockerfile above.
So the combination of RUN apt-get install poppler-utils -y in the Dockerfile and not passing a path in the code, i.e. pages = convert_from_path(file_source_path, 600), works: installing poppler-utils puts the poppler binaries on the PATH, so pdf2image finds them automatically.
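To confirm that a rebuilt image really has poppler on the PATH, a quick check along these lines can be run inside the container. This is only a minimal sketch using the standard library: the helper name find_poppler_tools is made up for illustration, and pdfinfo and pdftoppm are the poppler-utils command-line tools that pdf2image calls under the hood.

```python
import shutil

# poppler-utils command-line tools that pdf2image relies on
POPPLER_TOOLS = ("pdfinfo", "pdftoppm")


def find_poppler_tools(tools=POPPLER_TOOLS):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    # Hypothetical helper for illustration; not part of pdf2image
    for tool, location in find_poppler_tools().items():
        print(f"{tool}: {location or 'NOT FOUND on PATH'}")
```

If either tool is reported as not found, the image most likely still comes from a stale build and does not contain the poppler-utils layer.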
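The bind mount can also be removed from docker-compose.yml and from the .env file. It could not have worked anyway, since the mounted binaries are Windows executables and cannot run in the Linux container.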
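If the same code still has to run on the Windows machine, where poppler may not be on the PATH, one option is to pass poppler_path only when the environment variable is actually set. This is a minimal sketch, assuming POPPLER_PATH is defined in the local Windows environment and left unset in the container; poppler_kwargs is a hypothetical helper, not part of pdf2image.

```python
import os


def poppler_kwargs(env_var="POPPLER_PATH"):
    """Return {'poppler_path': ...} if env_var points to a directory, else {}."""
    path = os.environ.get(env_var)
    if path and os.path.isdir(path):
        return {"poppler_path": path}
    # Unset or invalid: let pdf2image look the tools up on PATH
    return {}
```

It would then be used as pages = convert_from_path(file_source_path, 600, **poppler_kwargs()), so the container falls back to the PATH lookup while the Windows setup keeps its explicit path.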