python, flask, aws-lambda, zappa

AWS Lambda - removing Python packages to speed up deployment and execution


I'm updating an AWS Lambda function that relies on Flask / zappa and was originally written by another programmer. If Python modules are not imported or used by the scripts running in Lambda, can I safely remove them from requirements.txt?

The example file provided by AWS seems to have very few requirements. I just want to make sure that CloudWatch continues to work and that I'm not deleting things AWS depends on implicitly.

Some of the packages that I am considering removing (since not imported by the Python scripts) include:

For example, removing pyarrow and scikit-learn cut the redeploy time from 3 minutes to 2 minutes. The Lambda function also uses less RAM and runs for a shorter duration.


Solution

  • It's hard to say as the dependencies vary across applications.

    As far as I know, scikit-learn is a huge library that AWS itself does not rely on. However, if your application, or another package your application depends on, requires it, removing it might break your application.

    Similarly, pyarrow is a dependency of many packages, so some of the packages in your application could be using it internally.

    I'm not sure, but boto3 can probably also be removed, since the AWS Lambda Python runtime makes it available by default. You might want to keep it anyway, since you'll need it to run the application locally.

    For all other packages, I suggest you create a dependency tree of your packages using pipdeptree. This will serve as a starting point for determining which ones can be removed.
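To illustrate the idea behind a dependency tree, here is a minimal sketch that builds a reverse-dependency map with the standard-library `importlib.metadata` module (Python 3.8+). It is not how pipdeptree itself is implemented. The helper names (`normalize`, `requirement_name`, `reverse_dependencies`, `installed_requirements`) are made up for this example, and the candidate list at the bottom is only an assumption based on the packages mentioned in the question.

```python
import re
from importlib import metadata


def normalize(name):
    """Normalize a distribution name: lowercase, runs of '-', '_', '.' become '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(requirement):
    """Extract the bare package name from a requirement string like 'numpy>=1.20'."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    return normalize(match.group(1)) if match else None


def reverse_dependencies(requires_by_package):
    """Map each package name to the set of packages that declare it as a requirement.

    Optional (extra-only) requirements are counted too, which errs on the side
    of keeping a package rather than removing it.
    """
    reverse = {}
    for package, requirements in requires_by_package.items():
        for requirement in requirements:
            name = requirement_name(requirement)
            if name:
                reverse.setdefault(name, set()).add(normalize(package))
    return reverse


def installed_requirements():
    """Collect the declared requirements of every distribution in this environment."""
    result = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:
            result[name] = dist.requires or []
    return result


if __name__ == "__main__":
    reverse = reverse_dependencies(installed_requirements())
    # Hypothetical candidates for removal, taken from the question.
    for candidate in ("pyarrow", "scikit-learn", "boto3"):
        users = sorted(reverse.get(normalize(candidate), set()))
        print(f"{candidate}: required by {users if users else 'nothing installed'}")
```

Run it inside the same virtual environment that zappa packages for deployment. A package that nothing else requires and that your own code never imports is a reasonable candidate for removal, but redeploy to a test stage and exercise the function before relying on the result. pipdeptree offers a comparable view from the command line with `pipdeptree --reverse --packages scikit-learn`.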