How Typosquatting Works

Package typosquatting exploits the way developers install dependencies: they type (or copy) a package name, and the package manager fetches whatever matches that name from the registry. There is no built-in verification that the package you installed is the package you intended.

The attack model is passive: register plausible typos of popular package names, publish packages with malicious payloads, and wait. Developer machines and CI environments install packages continuously, and probability does the rest: with millions of installs per day, even a 0.001% mistype rate (about 10 mistaken installs per million) adds up to thousands of hits over time.
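
The arithmetic behind that estimate can be sketched as follows. The install volume and the one-year window are illustrative assumptions, not measured figures, and the function name is hypothetical.

```python
def expected_mistyped_installs(daily_installs: int, mistype_rate: float, days: int = 1) -> float:
    """Expected number of installs that land on a typo name.

    Assumes every install mistypes independently with the same probability.
    """
    return daily_installs * mistype_rate * days


# 0.001% written as a fraction.
rate = 0.001 / 100

# Illustrative volume: one million installs per day for a single popular package.
per_day = expected_mistyped_installs(1_000_000, rate)
per_year = expected_mistyped_installs(1_000_000, rate, days=365)

print(f"per day: {per_day:.0f}, per year: {per_year:.0f}")
```

At this assumed volume the daily figure is small, but it accumulates into the thousands over a year, and it scales linearly with install volume.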

Name Mutation Techniques

Attackers use a range of mutation strategies to generate plausible typos:

  • Letter transposition: lodash → lodahs, loadsh
  • Missing letters: express → expres, exprss
  • Extra letters: axios → axxios, axious
  • Separator confusion: cross-env → crossenv, cross_env
  • Homophone substitution: colors → colours (legitimate but sometimes used to stage forks)
  • Prefix/suffix addition: lodash → lodash-utils, node-lodash
  • Version-appended names: react-v18, webpack5, targeting developers who add version numbers
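
Maintainers can use the same mutations defensively, to list look-alike names worth monitoring or pre-registering for their own packages. The sketch below covers only some of the strategies above; the function name and the affix lists are hypothetical choices for illustration.

```python
def typo_candidates(name: str) -> set[str]:
    """Generate simple typo variants of a package name for defensive monitoring."""
    variants: set[str] = set()

    # Letter transposition: swap each pair of adjacent characters.
    for i in range(len(name) - 1):
        variants.add(name[:i] + name[i + 1] + name[i] + name[i + 2:])

    # Missing letters: drop one character at a time.
    for i in range(len(name)):
        variants.add(name[:i] + name[i + 1:])

    # Extra letters: double each character in turn.
    for i in range(len(name)):
        variants.add(name[:i + 1] + name[i] + name[i + 1:])

    # Separator confusion: remove a separator or swap it for another one.
    for sep in "-_.":
        if sep in name:
            variants.add(name.replace(sep, ""))
            for other in "-_.":
                if other != sep:
                    variants.add(name.replace(sep, other))

    # Prefix/suffix addition, using a small hypothetical affix list.
    for prefix in ("node-", "py-"):
        variants.add(prefix + name)
    for suffix in ("-utils", "-js"):
        variants.add(name + suffix)

    # The original name (and an empty string) are not typos.
    variants.discard(name)
    variants.discard("")
    return variants


if __name__ == "__main__":
    for candidate in sorted(typo_candidates("lodash")):
        print(candidate)
```

For lodash the output includes lodahs, loadsh, lodash-utils, and node-lodash. Checking which candidates are already registered is a separate step and is not shown here.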

Across npm, PyPI, and RubyGems

npm

npm is the highest-volume target. Notable historical examples include crossenv (2017; downloaded only a few hundred times in the roughly two weeks before npm removed it), babelcli, and loadash. npm's scoped packages (@scope/package) create another attack surface: attackers can register a look-alike scope such as @lgger/debug to catch mistyped scope names.

PyPI

PyPI and pip normalise package names, so requests, Requests, and REQUESTS all refer to the same package. Normalisation only covers case and separator differences, not misspellings. The registry still allows registration of confusingly similar names: reqeusts, request, requests-async.
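
The normalisation rule is specified in PEP 503: lowercase the name and collapse runs of "-", "_", and "." into a single "-". A minimal sketch shows what it does and does not catch; the function name is my own choice.

```python
import re


def normalize(name: str) -> str:
    """Normalise a Python package name following the rule given in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


if __name__ == "__main__":
    for raw in ("requests", "Requests", "REQUESTS", "requests_async", "reqeusts"):
        print(raw, "->", normalize(raw))
```

Case and separator variants collapse to one name, but a transposition such as reqeusts stays distinct, which is why it can be registered as a separate project.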

RubyGems

RubyGems attacks are less frequent but documented. The rest-client gem was compromised in 2019 when an attacker gained access to a maintainer's account and published backdoored versions. This was an account takeover rather than typosquatting proper, though the outcome was the same: a trusted package name serving malicious code.

Payload Patterns in Typosquatted Packages

Typosquatted packages typically have one of three payload patterns:

  1. Immediate exfiltration: the payload runs at install time via hooks, collects environment variables and developer machine information, and sends them to the attacker's server. These are short-lived packages that activate once and self-destruct.
  2. Persistent runtime payload: the package appears functional (it re-exports the legitimate package) but includes a secondary payload that activates at runtime, harvesting data processed by the application.
  3. Watering hole for CI: the payload activates only in CI environments, where cloud credentials are available as environment variables.
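
For the install-time pattern, one cheap review step is to check whether a package declares install hooks at all. The sketch below assumes an npm package, where the preinstall, install, and postinstall scripts in package.json run during installation. The manifest contents are invented for illustration.

```python
import json

# npm lifecycle scripts that run when a package is installed.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def install_hooks(package_json_text: str) -> dict[str, str]:
    """Return the install-time scripts declared in a package.json document."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts", {})
    if not isinstance(scripts, dict):
        return {}
    return {hook: scripts[hook] for hook in INSTALL_HOOKS if hook in scripts}


# Hypothetical manifest: the package name and script contents are made up.
example_manifest = json.dumps({
    "name": "example-package",
    "version": "1.0.0",
    "scripts": {"postinstall": "node setup.js", "test": "jest"},
})

if __name__ == "__main__":
    for hook, command in install_hooks(example_manifest).items():
        print(f"{hook}: {command}")
```

An install hook is not malicious in itself, since many legitimate packages compile native code this way. It does mark the packages whose scripts deserve a manual read before the first install.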

Many typosquatted packages are functionally identical to the legitimate package. They copy the real package's code, add a small malicious addition, and publish. Users who install the typosquatted version get working software along with malware. The package may remain undetected for weeks because it doesn't cause obvious failures.

AI-Scaled Typosquatting Campaigns

LLMs have significantly reduced the cost of running typosquatting campaigns. Attackers use AI to:

  • Generate comprehensive lists of plausible typos for every popular package (a task previously done manually)
  • Generate convincing README files and package descriptions that make typosquatted packages look legitimate
  • Produce realistic commit histories and changelogs
  • Write polymorphic payload code that evades static signature scanners

The result is a dramatic increase in the volume and sophistication of typosquatting campaigns. Security researchers reported over 10,000 suspected typosquatting packages removed from PyPI in a single week in 2023.

Detection and Prevention

  • Lock files: commit and verify lock files; they specify exact package versions by name and hash. A new package install from a typo will change the lock file, which creates a detectable diff.
  • Private registry mirroring: mirror only approved packages in a private registry; developers install from the mirror, never directly from the public registry
  • Dependency allowlisting: maintain a list of approved dependencies; any new package must go through an approval process
  • Name similarity alerting: tools like confused and AquilaX's supply chain scanner check installed packages against known typosquatting patterns
  • First-install alerting: alert on packages being installed for the first time in your environment that are less than 30 days old or have very low download counts
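
Name similarity alerting can be approximated with plain edit distance. The sketch below is not how any particular scanner works; the list of known names, the threshold of 2, and the function names are all assumptions chosen for illustration.

```python
from typing import Optional, Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a two-row dynamic programme."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


# Illustrative list only; a real check would use your approved dependency list.
KNOWN_NAMES = ("lodash", "express", "axios", "cross-env", "requests")


def lookalike_of(name: str,
                 known: Sequence[str] = KNOWN_NAMES,
                 max_distance: int = 2) -> Optional[str]:
    """Return the known name that `name` closely resembles, or None."""
    if name in known:
        return None  # an exact match is the real package, not a look-alike
    for candidate in known:
        if edit_distance(name, candidate) <= max_distance:
            return candidate
    return None


if __name__ == "__main__":
    for installed in ("lodash", "loadsh", "crossenv", "reqeusts", "numpy"):
        print(installed, "->", lookalike_of(installed))
```

A threshold of 2 is needed to catch transpositions, which plain Levenshtein counts as two edits. It also produces false positives for short names, so the result is better treated as an alert for review than as an automatic block.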

The single most effective control is a private package registry that mirrors approved packages. Developers can only install packages that are in the mirror. New packages require explicit approval. This eliminates the entire typosquatting surface for installation, though not for initial PR review, where a developer might suggest adding a new dependency.
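
The PR review gap can be narrowed by combining the lock file diff with the allowlist. The sketch below assumes npm's package-lock.json format (lockfileVersion 2 or 3, where installed packages appear as keys under "packages"). The function names and sample data are hypothetical.

```python
import json


def locked_packages(lock_text: str) -> set[str]:
    """Extract installed package names from a package-lock.json document."""
    lock = json.loads(lock_text)
    names: set[str] = set()
    for path in lock.get("packages", {}):
        # Skip the root project (empty key) and workspace paths.
        if "node_modules/" not in path:
            continue
        # Keys look like "node_modules/a/node_modules/@scope/b"; keep the last name.
        names.add(path.rsplit("node_modules/", 1)[-1])
    return names


def unapproved_additions(old_lock: str, new_lock: str, allowlist: set[str]) -> set[str]:
    """Packages that a change adds to the lock file but that are not on the allowlist."""
    added = locked_packages(new_lock) - locked_packages(old_lock)
    return added - allowlist


if __name__ == "__main__":
    # Hypothetical before/after lock files for a pull request.
    before = json.dumps({"lockfileVersion": 3, "packages": {
        "": {"name": "app"},
        "node_modules/lodash": {"version": "4.17.21"},
    }})
    after = json.dumps({"lockfileVersion": 3, "packages": {
        "": {"name": "app"},
        "node_modules/lodash": {"version": "4.17.21"},
        "node_modules/express": {"version": "4.18.2"},
        "node_modules/loadsh": {"version": "1.0.0"},
    }})
    print(unapproved_additions(before, after, {"lodash", "express"}))
```

Run in CI against the base and head versions of the lock file, a non-empty result can fail the build until someone approves the new dependency.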