pip amazon-emr geopandas

Amazon EMR: No matching distribution found for geopandas==0.14.0


I am trying to start an Amazon EMR 6.14.0 cluster.

Here is my bootstrap script set_up.sh:

#!/usr/bin/env bash
set -e

python3 -m pip install geopandas==0.14.0

However, the Amazon EMR cluster failed to start, with this error in the log:

ERROR: Could not find a version that satisfies the requirement geopandas==0.14.0 (from versions: 0.1.0.dev-a7b594e, 0.1.0.dev-a91e8ab, 0.1.0.dev-abb4137, 0.1.0.dev-c770a5c, 0.1.0.dev-fea5c40, 0.1.0.dev-1edddad, 0.1.0.dev-2be7de8, 0.1.0.dev-2d642e0, 0.1.0.dev-3ebcae0, 0.1.0.dev-6c816f0, 0.1.0.dev-40ec104, 0.1.0.dev-53f669a, 0.1.0.dev-120d5ee, 0.1.0.dev-189a35c, 0.1.0.dev-7788a17, 0.1.0.dev-8144a45, 0.1.0.dev-120828c, 0.1.dev0, 0.1.0, 0.1.1, 0.2, 0.2.1, 0.3.0, 0.4.0, 0.4.1, 0.5.0, 0.5.1, 0.6.0rc1, 0.6.0, 0.6.1, 0.6.2, 0.6.3, 0.7.0, 0.8.0, 0.8.1, 0.8.2, 0.9.0, 0.10.0, 0.10.1, 0.10.2)
ERROR: No matching distribution found for geopandas==0.14.0

geopandas does have a 0.14.0 release on PyPI: https://pypi.org/project/geopandas/0.14.0/

Any ideas? Thanks!


Solution

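  • It turns out Amazon EMR 6.14.0 uses Python 3.7 by default, while geopandas 0.14.0 requires Python 3.9 or newer. pip only offers releases compatible with the running interpreter, which is why the newest version in the list was 0.10.2. I needed to install a separate, newer Python version.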

    Based on "Custom Python3 Version on EMR", I changed the bootstrap script set_up.sh to the following:

    #!/usr/bin/env bash
    set -e
    
    # Install New Python
    PYTHON_VERSION=3.11.6
    
    # Replace old OpenSSL and add build utilities
    sudo yum -y remove openssl-devel* && \
    sudo yum -y install gcc openssl11-devel bzip2-devel libffi-devel tar gzip wget make expat-devel
    
    # Install Python
    wget https://www.python.org/ftp/python/${PYTHON_VERSION}/Python-${PYTHON_VERSION}.tgz
    tar xzvf Python-${PYTHON_VERSION}.tgz
    cd Python-${PYTHON_VERSION}
    
    # We aim for similar `CONFIG_ARGS` that AL2 Python is built with
    ./configure --enable-loadable-sqlite-extensions --with-dtrace --with-lto --enable-optimizations --with-system-expat \
        --prefix=/usr/local/python${PYTHON_VERSION}
    
    # Install into /usr/local/python3.11.x
    # Note that "make install" links /usr/local/python3.11.x/bin/python3 while "altinstall" does not
    sudo make altinstall
    
    sudo /usr/local/python${PYTHON_VERSION}/bin/python3.11 -m pip install --upgrade pip
    
    # Install geopandas 0.14.0
    /usr/local/python${PYTHON_VERSION}/bin/python3.11 -m pip install geopandas==0.14.0
    

    I can now install geopandas 0.14.0 and start the Amazon EMR cluster successfully.
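
To make this kind of failure easier to spot, a bootstrap script could compare the interpreter version against the package's minimum before calling pip. The following is only a minimal sketch: the helper name version_ge and the variables REQUIRED and CURRENT are made up for illustration, and it assumes GNU sort with -V (version sort) is available, as it is on Amazon Linux 2.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds if version $1 is greater than or equal to version $2.
# Uses GNU sort -V (version sort); the smaller of the two versions sorts first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# geopandas 0.14.0 requires Python 3.9 or newer.
REQUIRED=3.9

# Ask the interpreter for its own version; fall back to 0 if python3 is missing.
CURRENT=$(python3 -c 'import platform; print(platform.python_version())' 2>/dev/null || echo 0)

if version_ge "$CURRENT" "$REQUIRED"; then
    echo "python3 ${CURRENT} can install geopandas 0.14.0"
else
    echo "python3 ${CURRENT} is older than ${REQUIRED}; pip will not offer geopandas 0.14.0" >&2
fi
```

On the default EMR 6.14.0 interpreter (Python 3.7) this should take the second branch, which points at the real cause more directly than pip's "No matching distribution found" message.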