python google-cloud-dataflow apache-beam snyk

Snyk reporting vulnerabilities in Apache Beam 2.52.0


We have Snyk integrated with our Python repository to identify vulnerabilities in the libraries we use. I am trying to add apache-beam 2.52.0 (the latest version) as a dependency in our pyproject.toml file. However, during the build Snyk reports a vulnerability in pyarrow 11.0.0, which Apache Beam uses internally, and this causes the build to fail.

Pin pyarrow@11.0.0 to pyarrow@14.0.1 to fix
  ✗ Deserialization of Untrusted Data (new) [Critical Severity][https://security.snyk.io/vuln/SNYK-PYTHON-PYARROW-6052811] in pyarrow@11.0.0
    introduced by apache-beam@2.52.0 > pyarrow@11.0.0

I tried going back to Apache Beam 2.44.0, which uses pyarrow 9 internally, but the same vulnerability is reported for all versions. Is there any workaround for this? (I might not be able to disable Snyk or add any exclusions.)


Solution

  • This should already be handled by https://github.com/apache/beam/issues/29392 in Beam 2.52.0, as long as pyarrow_hotfix is installed. In that case you can ignore the Snyk finding.
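
To confirm that the hotfix is actually present in the environment Snyk scans, something like the following could be run inside the build. It is only a sketch: the helper names `installed_version` and `hotfix_status` are made up for illustration, and the distribution name `pyarrow-hotfix` is assumed to be the PyPI name of the package that provides the `pyarrow_hotfix` module.

```python
from importlib import metadata

# Distribution names as published on PyPI (assumed; adjust if your index differs).
PACKAGES = ("apache-beam", "pyarrow", "pyarrow-hotfix")


def installed_version(package):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def hotfix_status():
    """Map each package of interest to its installed version (or None)."""
    return {name: installed_version(name) for name in PACKAGES}


if __name__ == "__main__":
    for name, version in hotfix_status().items():
        print(f"{name}: {version or 'not installed'}")
```

If `pyarrow-hotfix` shows up as not installed, the condition in the answer above is not met and the Snyk finding should not be dismissed.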