Security Risks with Python Package Naming Convention: Typosquatting and Beyond

A single innocent-looking pip install command can compromise an entire environment through backdoors, trojans, and other malicious packages. But why? Open source supply chain security often focuses on package managers and package repositories.

What is the open source supply chain? The supply chain encompasses all the components, libraries, and tools contributing to a software product. For Python developers, this often includes packages sourced from the Python Package Index (PyPI). While these packages offer immense convenience and functionality, they also introduce potential security vulnerabilities. For example, malicious actors can exploit these vulnerabilities to inject harmful code into applications, leading to data breaches, unauthorized access, and other security incidents.

Typosquatting as an attack vector

One of the prominent attack vectors in the Python ecosystem is typosquatting. Often witnessed in JavaScript’s npm registry but not a stranger to the PyPI registry, this technique involves creating malicious packages with names that closely resemble legitimate ones. The attackers aim to trick developers into inadvertently installing these malicious packages by making a slight typo or misremembering the exact package name. Once installed, these packages can execute a code payload that seeks to exploit the system they’re on, such as harvesting sensitive information from the machine’s environment variables.

Python PyYAML package name confusion

Attackers often exploit the similarity in package names to distribute malicious code. Consider the popular PyYAML package. To install it, developers use the command pip install pyyaml, but in their code, they import it using import yaml. This inconsistency can lead to confusion, where a developer might mistakenly run pip install yaml, potentially installing a different, possibly malicious package.

This is also evident in the official PyYAML documentation:

FAQ: What should I do if I suspect a package I installed is malicious?

If you suspect a package is malicious, take the following steps:

Stop using the package: Immediately cease using the package in your project.
Report the package: Report the suspicious package to the repository maintainers (e.g., PyPI) for further investigation and ensure your local AppSec and TrustSec teams follow security breach procedures (resetting and rotating secrets, etc.).
Audit your system: Check your system for any unauthorized changes or suspicious activity.
Review the code: Ensure the malicious package has not compromised your codebase and remote code repository, including the remote registry and any other artifacts that store registries.
Data breach procedures: The system is likely considered compromised and should be treated as such.
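As one small part of the audit step above, the packages installed in an environment can be compared against the list a project expects. This is only a sketch under stated assumptions: the function name unexpected_packages and the idea of an expected list are hypothetical, and names are compared after lowercasing and collapsing runs of hyphens, underscores, and dots, so that spelling variants of the same name do not produce false alarms.

```python
import re
from importlib.metadata import distributions


def _normalize(name):
    # Lowercase and collapse runs of "-", "_" and "." into a single hyphen.
    return re.sub(r"[-_.]+", "-", name).lower()


def unexpected_packages(expected, installed=None):
    """Return installed package names that are not in the expected list.

    `expected` is a hypothetical list of packages the project is meant to
    depend on. `installed` defaults to the current environment.
    """
    if installed is None:
        installed = [dist.metadata["Name"] for dist in distributions()]
    allowed = {_normalize(name) for name in expected}
    # Skip entries with missing names (broken metadata).
    found = {_normalize(name) for name in installed if name}
    return sorted(found - allowed)


if __name__ == "__main__":
    # Illustrative expected list; a real project would load its own.
    for name in unexpected_packages(["pip", "setuptools", "wheel"]):
        print("not in expected list:", name)
```

An unexpected entry is not proof of compromise, since transitive dependencies will also show up, but it gives a starting point for the review.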

More package naming confusion with pip

Another example of package confusion arises from the PURL spec and the pip package manager's handling of underscores and hyphens. For instance, executing pip install langchain-community and pip install langchain_community installs the same package despite the different strings. Attackers can exploit this behavior to create packages with subtle name variations, increasing the risk of accidental installation of malicious software.

Running pip install langchain-community demonstrates how underscores and hyphens are both allowed and interpreted as the same separator.

Additionally, pip treats package names as case insensitive. This means pip install PackageName and pip install packagename will yield the same result. This is unlike the npmjs registry, which previously allowed case sensitivity but eventually adopted a case-insensitive approach. Attackers can leverage the lack of case distinction in pip to further obfuscate malicious packages.
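The equivalence described above can be sketched in a few lines. The example below assumes the name normalization rule from PEP 503, which is not cited in this article: lowercase the name and collapse any run of hyphens, underscores, or dots into a single hyphen. The function name normalize is chosen here for illustration.

```python
import re


def normalize(name):
    """Normalize a project name using the rule described in PEP 503."""
    # Collapse runs of "-", "_" and "." into one hyphen, then lowercase.
    return re.sub(r"[-_.]+", "-", name).lower()


if __name__ == "__main__":
    for raw in ["langchain-community", "langchain_community", "PackageName"]:
        print(raw, "->", normalize(raw))
```

Comparing normalized names, rather than raw strings, is a reasonable way to tell whether two spellings refer to the same project before treating one of them as suspicious.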

FAQ: What is typosquatting in the context of Python packages?

Typosquatting is a malicious practice where attackers create packages with names similar to popular or legitimate packages. The goal is to trick developers into installing these malicious packages by exploiting common typos or naming conventions. For example, if a developer intends to install the popular requests package but accidentally types requsets, they might inadvertently install a malicious package.
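A simple pre-install check can catch typos like the one above. The sketch below uses difflib from the Python standard library to flag a requested name that is close to, but not exactly, a well-known package name. The KNOWN_PACKAGES list, the possible_typosquat function, and the 0.8 similarity cutoff are all assumptions made for illustration, not part of pip or PyPI.

```python
import difflib

# Illustrative list only; a real check would use a much larger set of
# popular or internally approved package names.
KNOWN_PACKAGES = ["requests", "numpy", "pyyaml", "urllib3"]


def possible_typosquat(name, known=KNOWN_PACKAGES, cutoff=0.8):
    """Return the known package `name` closely resembles, or None.

    An exact match is treated as fine and returns None.
    """
    candidate = name.lower()
    if candidate in known:
        return None
    matches = difflib.get_close_matches(candidate, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    suspect = possible_typosquat("requsets")
    if suspect:
        print(f"'requsets' looks similar to '{suspect}'. Is that what you meant?")
```

A check like this is heuristic: it only knows about the names in its list, and a legitimate package can have a name close to another one, so a match should prompt a manual look rather than an automatic block.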

Summary and further reading

Understanding the risks associated with malicious packages in Python is crucial for maintaining a secure software supply chain. To avoid falling for typosquatting and other naming confusion, be vigilant when installing packages and double-check their names and sources. For further reading on supply chain security risks, consider exploring dependency confusion and maintainer account takeover incidents on PyPI.

FAQ: How can developers protect themselves from installing malicious Python packages?

Developers can take several precautions to protect themselves:

Double-check package names: Always verify the spelling of package names before installation.
Review package metadata: Check the metadata and documentation for inconsistencies or red flags.
Use trusted sources: Only install packages from trusted repositories like PyPI.
Implement security tools: Use Snyk Open Source to scan for vulnerabilities and malicious packages.

Here’s further guidance on maintaining secure open source practices and implementing strong supply chain security measures to protect your projects effectively:

Strengthen your security posture with Snyk

As developers and DevOps professionals, safeguarding your projects from supply chain attacks is paramount. The risks posed by malicious packages, such as those introduced through typosquatting, underscore the need for robust security measures. Signing up for Snyk can enhance your security posture and protect your projects from these threats.

Proactively manage dependencies

Use Snyk's comprehensive suite of tools to manage your dependencies proactively. Snyk provides real-time vulnerability scanning and alerts, ensuring you are aware of any risks associated with your packages. This proactive approach helps you mitigate potential threats before they impact your projects.

Leverage Snyk's integration capabilities

Snyk seamlessly integrates with your existing development workflow, allowing you to incorporate security checks into your CI/CD pipelines, Python IDE, and git SCM such as GitHub, GitLab, and BitBucket. This integration ensures that security is a continuous process rather than an afterthought.

Stay informed and educated

Have you already taken a quick Python security lesson from Snyk Learn? Snyk offers many educational resources to inform you about the latest security threats and best practices. By staying up-to-date with these resources, you can better understand the evolving landscape of application and open source security and take informed actions to protect your projects.
