Python Poetry package manager and security integration with software composition analysis tool

Written by:
Abhay Bhargav

November 13, 2020

I have always believed that package managers can be the ultimate weapon in the fight against vulnerable dependencies. If package managers can be leveraged to scan for vulnerable dependencies, developers would be able to identify and fix vulnerabilities in their dependencies more easily and quickly, rather than letting the vulnerability snake its way into the build process.

However, for this to work, the solution has to be:

  1. simple — very easy to use

  2. integrated — should work as part of the developer’s workflow and with the package manager in question

I have recently started using the Python Poetry package management system. I really like the way it's organized and some of the ways in which it meets its stated goals — so, I was hoping for a security scanning capability that can be integrated with it.

That’s why I was happy to hear that Snyk recently introduced a new feature to support the Poetry package manager and the poetry lock syntax. To take a closer look at how this works, I signed into my newly created test Snyk account and created a new project with some intentionally vulnerable dependencies.

I ran the poetry new command to create a new project called Poetry snyk, which set up a directory called “poetry snyk” and a package called poetry_snyk. This is basically boilerplate, along with a TOML file.

So, if I go into the project directory and run tree, I see a README, the root package poetry_snyk, the TOML file that serves as the descriptor for Poetry, and some boilerplate tests: more or less a pytest test that Poetry has written for me.

[Image: tree output for the poetry_snyk project]

If you look at the TOML file, we're using Python 3.7 (a slightly older version of Python, I must admit, but it's the local version I'm using for dependency reasons). We also have a dev dependency for testing called pytest, in this case version 5.2. The file also has a build section, which I am not using here.

One thing you should realize about Poetry is that the TOML file replaces the previously required requirements.txt and setup.py files. Thus, with Poetry you can not only set up dependencies and track them, you can also build your package and publish it to the Python Cheese Shop (PyPI). It is pretty interesting that just one TOML file can replace both setup.py and requirements.txt.

[Image: the project's TOML file]

One thing I like about Poetry over the previous package managers is that you can use this as an integrated solution rather than a patchwork solution.

A lot of the previous package managers that I worked with, even Pipenv, felt incomplete, since Pipenv was built on top of the pip codebase, more as a wrapper. I feel Poetry takes it to another level because it reframes the problem, essentially combining the best of npm, with its descriptor and lock file, and Rust’s Cargo package management system, which gives you some documentation and tests out of the box. I think Poetry has tried to combine these two approaches.

Now we have created a new Poetry project called poetry_snyk, but we still need to install some libraries to get it going. Since I am using it with Snyk, I want to test it with insecure libraries, typically ones with previously known vulnerabilities, to see whether Snyk catches them when scanning. Snyk needs two files to be available, so let us add some dependencies to our project with the following command:

[Image: adding the python-docx dependency with Poetry]

This is the python-docx project, which allows you to parse and create docx files. It is vulnerable to an XML external entity (XXE) flaw, a pretty bad flaw that I've demoed several times in my training sessions.

Basically, if you load a malicious Word document containing an XML entity that points to something like /etc/passwd, the docx library resolves that entity, leading to Local File Inclusion, which can possibly even be chained into a Remote Code Execution exploit. For this test, though, I am going to install it anyway. Poetry resolves the dependencies and the nested dependencies, updates the TOML file, and adds a poetry.lock file. The poetry.lock file gives you all the nested dependencies for that top-level dependency.

[Image: nested dependencies in poetry.lock]

You can see that it lists all the nested dependencies. Scanning tools such as Snyk, like many other source code or package analysis tools, need this information to analyze the nested dependencies.

Apart from the nested dependencies, it also gives you the version of each nested dependency along with its hash value, so that the dependency is pinned precisely. This was quite a big gap in the Python ecosystem because pip, the default Python dependency manager, did not provide all of these details: it only listed dependencies without precisely pinning them. Thus it did not do a very good job of managing sub-dependencies.

Let’s say I install another vulnerable Python package with Poetry, in this case pyyaml. I am going to use version 3.13, which is vulnerable to an insecure deserialization flaw. If this flaw is exploited, an attacker can achieve remote code execution in the target application environment. I am going to install this, and you will see that it also updates my TOML file as well as the lock file.

[Images: installing pyyaml with Poetry]

Now let’s monitor with Snyk by entering the following command:

[Images: the Snyk monitor command and its output]

Looks like it has analyzed all my dependencies. This is my Snyk organization, with all my projects listed, and you can see the project I've just created.

[Images: the new Poetry project in the Snyk dashboard]

As you can see, issues have been flagged, and all of them are high severity: the pyyaml insecure deserialization and the resulting remote code execution, a pretty well-known and serious exploit, and the XML external entity injection in python-docx as well. Snyk has been able to detect all the vulnerabilities in the dependencies I am using in this project, thanks to its comprehensive database and the simplicity of the integration!

[Images: vulnerabilities Snyk detected in the project, with an example vulnerability]

Getting started

To get started with this integration, all you need to do is install your Poetry project, and Snyk is able to track it from that point on. It also emails notifications about these dependencies, which is really cool!

I was able to set up a Poetry project very quickly, scan it for security issues with Snyk, and get an in-depth report of the consequences.

Keep in mind that Snyk requires two files, the lock file and the TOML file, to both be available for the integration to work.

[Image: the two files Snyk requires for Poetry projects]
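If you want to guard against a missing file before kicking off a scan, a pre-flight check along these lines might help. This is only a sketch: missing_poetry_files is a hypothetical helper, and it assumes the standard Poetry file names pyproject.toml and poetry.lock sitting in the project root.

```python
import tempfile
from pathlib import Path

# Standard Poetry file names, assumed to live in the project root.
REQUIRED_FILES = ("pyproject.toml", "poetry.lock")


def missing_poetry_files(project_dir):
    """Return the required files that are absent from project_dir."""
    root = Path(project_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


# Demonstrate on a throwaway directory rather than a real project.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "pyproject.toml").write_text("[tool.poetry]\n")
    before_lock = missing_poetry_files(tmp)
    (Path(tmp) / "poetry.lock").write_text("")
    after_lock = missing_poetry_files(tmp)

print(before_lock)
print(after_lock)
```

In a build script, an empty result would be the signal that it is safe to go ahead and run the Snyk scan from that directory.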