I'm running into the problem described in this issue: (aws-lambda-python-alpha): Failed to install numpy 2.3.0 with Python 3.11 or lower
My Dockerfile:
FROM public.ecr.aws/lambda/python:3.11
# Install
RUN pip install 'numpy<2.3.0'
RUN pip install 'pyarrow[s3]'
The pyarrow install still fails at this step:
Collecting numpy>=1.25
Downloading numpy-2.3.2.tar.gz (20.5 MB)
...
I wanted to force pyarrow to use numpy==2.2.1 but I don't see how from here. Do I need to lower the version of pyarrow?
If you read your build log, you can see that it is attempting to build pyarrow from source, and failing because it has no C compiler installed.
But why is it trying to build from source, rather than installing a pre-built wheel? The Lambda image you are using ships glibc 2.26. To install a pre-built wheel for pyarrow 21.0.0, you'd need glibc >= 2.28.
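To make the glibc comparison concrete, here is a minimal sketch of the rule pip applies when it decides whether a manylinux wheel is usable. The helper names (`required_glibc`, `wheel_fits`, `local_glibc`) are made up for illustration, and the `manylinux_2_28_x86_64` tag is an assumption based on the glibc >= 2.28 requirement mentioned above; pip's real logic lives in its own tag-matching code and covers more cases.

```python
import platform
import re

# Legacy manylinux aliases and the glibc version each one stands for (PEP 600).
LEGACY_GLIBC = {
    "manylinux1": (2, 5),
    "manylinux2010": (2, 12),
    "manylinux2014": (2, 17),
}


def required_glibc(platform_tag):
    """Return the minimum glibc (major, minor) a manylinux tag needs, or None."""
    match = re.match(r"manylinux_(\d+)_(\d+)_", platform_tag)
    if match:
        return int(match.group(1)), int(match.group(2))
    for alias, version in LEGACY_GLIBC.items():
        if platform_tag.startswith(alias + "_"):
            return version
    return None  # not a manylinux tag (e.g. musllinux, macosx, win)


def wheel_fits(platform_tag, glibc):
    """True if a wheel with this platform tag can run on the given glibc."""
    needed = required_glibc(platform_tag)
    return needed is not None and needed <= glibc


def local_glibc():
    """Return the running interpreter's glibc version as a tuple, or None."""
    name, version = platform.libc_ver()
    if name != "glibc" or not version:
        return None
    return tuple(int(part) for part in version.split(".")[:2])


if __name__ == "__main__":
    # Hypothetical tag, chosen to match the glibc >= 2.28 requirement.
    tag = "manylinux_2_28_x86_64"
    print("local glibc:", local_glibc())
    print("fits glibc 2.26:", wheel_fits(tag, (2, 26)))  # False
    print("fits glibc 2.34:", wheel_fits(tag, (2, 34)))  # True
```

Running `local_glibc()` inside the Lambda image (for example via a temporary `RUN python -c ...` step) is a quick way to confirm which glibc you are actually building against.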
That glibc mismatch leaves you a few ways to solve the problem.
One option is to pin pyarrow to 20.0.0. This works because that version of pyarrow still ships a wheel built for very old glibc.
FROM public.ecr.aws/lambda/python:3.11
# Install
RUN pip install 'numpy<2.3.0'
RUN pip install 'pyarrow[s3]==20.0.0'
Alternatively, switch to the Python 3.12 image. It uses glibc 2.34, so it is compatible with recent versions of pyarrow.
FROM public.ecr.aws/lambda/python:3.12
# Install
RUN pip install 'numpy<2.3.0'
RUN pip install 'pyarrow[s3]==21.0.0'
Both of the previous solutions require changing the version of Python or PyArrow. What if that's a nonstarter?
In theory, you could produce a wheel compatible with your version of glibc by building pyarrow from source in an image based on glibc 2.26 or older. You could then copy that wheel into your Lambda image and install it. The Apache Arrow developer documentation includes a guide on building pyarrow.