Malicious Code in Open Source Libraries: From Pull Request to Payload

Open source libraries are trusted implicitly. Attackers exploit that trust — through maintainer account takeover, slow-burn contribution history building, and changes that slip past rushed code review.

DevSecOps Team · April 2026 · 12 min read · Tags: Supply Chain, Open Source, SCA

The Open Source Trust Problem

When you add a dependency, you implicitly trust every contributor who has ever committed to that project — and every maintainer who has ever merged a PR. For a popular library maintained by a single volunteer over years, that trust chain is long, distributed, and largely unaudited.

The open source security model works well for vulnerabilities that are unintentional. It breaks down for intentional malicious contributions, because the social and technical mechanisms that catch bugs (many eyes, automated testing) don't reliably catch stealthy backdoors inserted by a trusted contributor.

Maintainer Account Takeover

The fastest path to compromising a popular library is compromising the maintainer's account on npm, PyPI, or GitHub. Methods:

Credential stuffing — maintainers reuse passwords; if a breach exposes credentials for another service, the registry account may be accessible
Phishing — targeted spear phishing of maintainers is well-documented; some attacks use fake security advisory emails asking maintainers to "verify" access
Package handoff — an attacker contacts a burned-out maintainer, offers to take over maintenance of a popular but neglected package, then publishes a malicious version after gaining trust
Social engineering GitHub app authorisations — tricking maintainers into authorising a malicious GitHub App that then has write access to the repository

Historically, many npm maintainer accounts had no MFA enabled. npm only began requiring it for maintainers of the top 100 packages by dependents in 2022. Until then, an entire ecosystem of critical infrastructure ran on single-password accounts.

Malicious Pull Requests

Contributing to a project over time to build trust before submitting a malicious PR is a documented attack pattern. The attacker:

Makes several legitimate, high-quality contributions over weeks or months
Builds a reputation as a trusted contributor
Submits a PR that fixes a real bug or adds a real feature — but includes a small, stealthy malicious change in a part of the codebase the reviewer is less familiar with
Gets the PR merged on the strength of that contribution history

The malicious change is often placed in a non-obvious location: deep in a utility module, inside a conditional that only triggers in specific environments, or in a build script rather than the main library code.
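The hiding places described above can be turned into a rough triage heuristic for reviewers. The sketch below is illustrative only: the file-name sets, the depth threshold, and the function names `review_flags` and `triage` are assumptions made for this example, not part of any existing tool. It labels changed paths that fall into build scripts, test-only directories, or deeply nested modules so they get a second look.

```python
from pathlib import PurePosixPath

# Hypothetical heuristics for illustration; tune the names for your own projects.
BUILD_FILE_NAMES = {"makefile", "setup.py", "configure.ac", "cmakelists.txt"}
BUILD_SUFFIXES = {".m4", ".mk", ".cmake"}
TEST_DIR_NAMES = {"test", "tests", "testdata", "fixtures"}
MAX_USUAL_DEPTH = 4  # assumed threshold for "deep in a utility module"


def review_flags(path: str) -> list[str]:
    """Return the reasons a changed file deserves a closer look (empty if none)."""
    p = PurePosixPath(path)
    flags = []
    if p.name.lower() in BUILD_FILE_NAMES or p.suffix.lower() in BUILD_SUFFIXES:
        flags.append("build script")
    # Only directory components count here, not the file name itself.
    if any(part.lower() in TEST_DIR_NAMES for part in p.parts[:-1]):
        flags.append("test-only path")
    if len(p.parts) > MAX_USUAL_DEPTH:
        flags.append("deeply nested")
    return flags


def triage(changed_paths):
    """Map each flagged path to its list of reasons; unflagged paths are omitted."""
    result = {}
    for path in changed_paths:
        flags = review_flags(path)
        if flags:
            result[path] = flags
    return result
```

The list of changed paths could come from `git diff --name-only` between the base and head of a PR. A flag is a prompt for human attention, not evidence of malice, and an attacker who knows the heuristic can work around it.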

Stealthy Commit Techniques

Techniques for hiding malicious code in commits:

Whitespace manipulation: hiding code in trailing whitespace, tab/space differences that affect Python indentation, or non-printing characters
Test-only placement: inserting malicious code in test files or test utilities that ship in the package but receive less scrutiny
Binary file changes: modifying compiled assets, certificates, or data files that can't be diff-reviewed in a standard PR
Build script injection: modifying build scripts (Makefile, setup.py, configure.ac) that run during compilation or installation
Merge commit hiding: adding changes in the merge commit itself, or in commits pushed after approval, so the merged result contains code that never appeared in the reviewed PR diff
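Of the techniques above, whitespace and non-printing character tricks are the easiest to check mechanically. The following is a minimal sketch, assuming the changed file is available as a decoded string; the function name `suspicious_characters` and the choice of what to report are assumptions for this example. It reports Unicode format characters (including bidirectional controls), trailing whitespace, and indentation that mixes tabs and spaces.

```python
import unicodedata

# Explicit bidirectional controls: U+202A..U+202E and U+2066..U+2069.
BIDI_CONTROLS = {chr(c) for c in (*range(0x202A, 0x202F), *range(0x2066, 0x206A))}


def suspicious_characters(text: str):
    """Yield (line_no, column, description) for things a reviewer cannot easily see."""
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in BIDI_CONTROLS:
                yield line_no, col, f"bidi control U+{ord(ch):04X}"
            elif unicodedata.category(ch) == "Cf":
                # "Cf" covers zero-width spaces, joiners, BOMs and similar.
                yield line_no, col, f"invisible format char U+{ord(ch):04X}"

        stripped = line.rstrip(" \t")
        if stripped != line:
            yield line_no, len(stripped) + 1, "trailing whitespace"

        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if " " in indent and "\t" in indent:
            yield line_no, 1, "mixed tabs and spaces in indentation"
```

A check like this fits naturally in CI on the files touched by a PR. It says nothing about the other techniques in the list: binary changes and build script edits still need a human reader.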

The xz-utils Case Study (2024)

The xz-utils backdoor is among the most sophisticated publicly documented supply chain attacks against an open source library. Key elements:

The attacker (Jia Tan) contributed to the xz project for over two years, making legitimate improvements and eventually gaining co-maintainer trust
The malicious payload was hidden across multiple commits, including in binary test files that couldn't be easily reviewed
The backdoor only activated in sshd builds that load liblzma through libsystemd, as patched by several major distributions; the deliberately narrow targeting reduced the chance of detection
It was discovered by a Microsoft engineer, Andres Freund, who noticed unusual sshd CPU usage during routine benchmarking, not by any security scanner

What xz-utils reveals: A patient, sophisticated attacker can spend years infiltrating a critical open source project, evade automated scanning, and get a backdoored release into the pre-release and rolling branches of major Linux distributions, within weeks of reaching their stable releases. Standard CVE-based SCA could not have caught this, because no CVE existed until after the backdoor was found.
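Binary test files like those used in the xz-utils attack cannot be read in a diff, but they can at least be inventoried so a reviewer knows they exist. The sketch below is an assumption-laden illustration, not a detector for that backdoor: the magic-byte table is a small hand-picked sample, and `classify_blob` and `find_binary_files` are hypothetical names. It walks a source tree and lists files that start with a known binary signature or contain NUL bytes in their first few kilobytes.

```python
import os

# A small, hand-picked sample of file signatures; real tools carry far more.
MAGIC_PREFIXES = {
    b"\x7fELF": "ELF executable",
    b"MZ": "PE executable",  # short prefix, so text starting with "MZ" also matches
    b"\xfd7zXZ\x00": "xz archive",
    b"\x1f\x8b": "gzip archive",
    b"PK\x03\x04": "zip archive",
    b"BZh": "bzip2 archive",
}


def classify_blob(head: bytes):
    """Return a label for binary-looking content, or None for likely text."""
    for magic, label in MAGIC_PREFIXES.items():
        if head.startswith(magic):
            return label
    if b"\x00" in head:
        return "unidentified binary"
    return None


def find_binary_files(root, sniff_bytes=4096):
    """List (relative_path, label) for binary-looking regular files under root."""
    findings = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if d != ".git"]  # skip VCS metadata
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            with open(path, "rb") as fh:
                head = fh.read(sniff_bytes)
            label = classify_blob(head)
            if label:
                findings.append((os.path.relpath(path, root), label))
    return sorted(findings)
```

Comparing the output between two releases of a dependency shows which opaque files were added or changed. The scan only tells you that a blob exists; deciding whether it is a legitimate fixture still requires someone to ask where it came from and how it was generated.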

Detection and Defences

No single control prevents this class of attack. The defence is a layered combination:

Pin exact versions — use exact version pins in lock files, not semver ranges that automatically pull new versions
Review dependency diffs — when upgrading a dependency, review the diff of what changed in the library, not just your own code changes
Verify package signatures — sigstore and npm's provenance features allow verifying that a package was built from a specific commit by a specific workflow
Monitor for unexpected new capabilities — a library update that suddenly introduces network calls, process spawning, or file system access deserves scrutiny regardless of whether it has a CVE
Binary file scanning — scan for executables, archives, and blobs in source repositories; these are common payload staging locations
SBOM generation and monitoring — generate an SBOM on each build and alert on unexpected new transitive dependencies
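Capability monitoring from the list above can be approximated for Python dependencies by comparing which sensitive modules each version imports. This is a minimal sketch under stated assumptions: the module-to-capability table is illustrative and incomplete, and `capabilities` and `new_capabilities` are hypothetical helper names. It parses source text with the standard `ast` module and reports capabilities that appear in the new version but not the old one.

```python
import ast

# Illustrative mapping only; extend it for the ecosystems you care about.
CAPABILITY_MODULES = {
    "socket": "network",
    "urllib": "network",
    "http": "network",
    "requests": "network",
    "subprocess": "process spawning",
    "ctypes": "native code",
    "shutil": "file system",
}


def capabilities(source: str) -> set[str]:
    """Return the capability labels implied by static imports in one source file."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]  # relative imports may have no module
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            if top_level in CAPABILITY_MODULES:
                found.add(CAPABILITY_MODULES[top_level])
    return found


def new_capabilities(old_sources, new_sources) -> set[str]:
    """Capabilities present in the new version's files but absent from the old."""
    old = set().union(*(capabilities(s) for s in old_sources))
    new = set().union(*(capabilities(s) for s in new_sources))
    return new - old
```

For example, if the old release only imports `json` and the new one adds `import socket`, the result is `{"network"}`. Static import scanning is easy to evade with dynamic imports or obfuscated strings, so treat an empty result as the absence of one signal, not as proof that an upgrade is safe.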
