Penetration Test vs Vulnerability Scan

These terms are used interchangeably in casual conversation, but they're fundamentally different:

  • Vulnerability scanning: An automated tool enumerates known CVEs and misconfigurations. Fast, cheap, scalable. Produces a list of potential vulnerabilities, not confirmation that they're exploitable.
  • Penetration testing: A human (or team) simulates a real attacker, chaining vulnerabilities together, exploiting them to demonstrate real-world impact. Manual, slower, more expensive. Produces proof of exploitability.

A CVSS 9.8 vulnerability in your scan report might be unexploitable because your WAF blocks the attack vector, or because the affected component isn't reachable from the internet. A pen tester would determine this, and might also find a CVSS 6.5 that's actually critical in your specific environment because of how it chains with another issue.
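The point above can be sketched in a few lines. This is an illustrative toy, not a real scoring methodology: the findings, the 0.3 downgrade factor, and the context flags are all invented to show how environment context can invert a CVSS-only ordering.

```python
# Toy sketch: raw CVSS alone doesn't determine real-world priority.
# All findings, factors, and flags below are hypothetical.

def effective_priority(finding):
    """Downgrade findings that aren't reachable in this environment;
    upgrade ones that chain into something worse."""
    score = finding["cvss"]
    if not finding["reachable"]:          # e.g. blocked by a WAF, not internet-facing
        score *= 0.3                      # arbitrary illustrative downgrade
    if finding["chains_to_critical"]:     # combines with another issue
        score = max(score, 9.0)
    return round(score, 1)

findings = [
    {"id": "CVE-A", "cvss": 9.8, "reachable": False, "chains_to_critical": False},
    {"id": "CVE-B", "cvss": 6.5, "reachable": True,  "chains_to_critical": True},
]

# The "critical" scanner finding drops below the chained 6.5.
for f in sorted(findings, key=effective_priority, reverse=True):
    print(f["id"], effective_priority(f))
```

With these made-up inputs, the unreachable 9.8 falls to the bottom of the list and the chainable 6.5 rises to the top, which is exactly the judgement a scanner can't make on its own.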

Both are necessary: Vulnerability scanning gives you continuous, broad coverage. Pen testing gives you depth and real-world impact assessment. They serve different purposes and complement each other.

Black Box, White Box, and Gray Box Testing

Black Box

The tester has no prior knowledge of the system, just a target (URL, IP range) and an objective. This best simulates an external attacker who has done their own reconnaissance. It's the most realistic approach but also the least efficient: the tester spends significant time on reconnaissance that your own team could skip.

White Box

The tester has full access: source code, architecture documentation, credentials, environment details. This maximises coverage for the time spent, since no time is wasted on reconnaissance. Ideal for finding deep vulnerabilities in your own code before release.

Gray Box

The tester has partial knowledge: perhaps user credentials but no source code, or network diagrams but no system configuration. Most commercial pen tests are gray box: the tester knows the technology stack and can focus on the attack surface rather than spending time discovering it.

For most teams, gray box delivers the most value: you get the efficiency of white box (the tester knows the stack) with the realism of black box (no source code review). It's the best ROI for most application pen tests.

The Phases of a Penetration Test

1. Reconnaissance

Passive and active information gathering. What subdomains does the target have? What technologies? What employees? What publicly exposed services? This intelligence shapes the rest of the test.

2. Scanning and Enumeration

Active probing to identify open ports, running services, software versions, and potential entry points. Tools: Nmap, Nessus, Shodan, Nuclei.
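The core idea behind port scanning can be sketched in a few lines of Python. This is a deliberately minimal TCP connect scan, nothing like the depth of Nmap; the target and port list are placeholders, and you should only ever scan hosts you are authorised to test.

```python
# Minimal TCP connect-scan sketch (the enumeration idea behind tools
# like Nmap, greatly simplified). Only scan authorised targets.
import socket

def scan_ports(host, ports, timeout=0.5):
    """Return the subset of `ports` that accept a TCP connection."""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            if s.connect_ex((host, port)) == 0:  # 0 means the connect succeeded
                open_ports.append(port)
    return open_ports

# Hypothetical usage against the local machine:
print(scan_ports("127.0.0.1", [22, 80, 443, 8080]))
```

Real scanners add SYN scanning, service fingerprinting, version detection, and timing evasion on top of this basic loop, which is why the tools listed above are the practical choice.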

3. Exploitation

Attempting to exploit discovered vulnerabilities. The tester documents everything: what worked, what didn't, and why. Exploitation serves two purposes: confirming the vulnerability is real, and establishing a foothold for further testing.

4. Post-Exploitation

From a foothold, what can be accessed? Can privileges be escalated? Can the tester reach other systems (lateral movement)? What sensitive data is accessible? This demonstrates the real business impact of a successful initial compromise.
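Lateral movement analysis is essentially graph reachability: given a foothold and knowledge of which hosts can talk to which, what can an attacker eventually reach? The sketch below models that with a breadth-first search over an entirely hypothetical network map (the host names and connections are invented).

```python
# Sketch of post-exploitation impact analysis: from a foothold, compute
# everything reachable via lateral movement. Network data is hypothetical.
from collections import deque

def reachable_from(foothold, adjacency):
    """Breadth-first search over the internal network graph."""
    seen, queue = {foothold}, deque([foothold])
    while queue:
        host = queue.popleft()
        for neighbour in adjacency.get(host, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen

network = {
    "web-01": ["app-01"],          # DMZ web server can reach the app tier
    "app-01": ["db-01", "ci-01"],  # app tier reaches the database and CI
    "ci-01":  ["db-01"],
}

# A compromise of web-01 ultimately exposes the database and CI server.
print(sorted(reachable_from("web-01", network)))
```

This is the kind of reasoning a tester performs (with real credentials and real network segmentation in the way); the business impact of the foothold is the full reachable set, not just the first compromised host.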

5. Reporting

Comprehensive documentation of all findings with evidence (screenshots, request/response pairs, proof-of-concept code), risk ratings, and remediation guidance. The report is the deliverable; a high-quality pen test with a poor report is worth less than you paid for it.

Web Application Pen Testing

Web app testing is typically the most relevant scope for development teams. A competent web app pen tester will systematically test for:

  • Authentication and session management flaws
  • Authorisation bypasses (IDOR, privilege escalation)
  • Injection vulnerabilities (SQL, NoSQL, LDAP, Command)
  • XSS (reflected, stored, DOM-based)
  • CSRF
  • Business logic flaws (application-specific)
  • API security issues (rate limiting, mass assignment, excessive data)
  • Server-side vulnerabilities (SSRF, XXE, deserialization)
  • Cryptography weaknesses
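Of the categories above, IDOR is a good example of why authorisation flaws need human testing: the request looks perfectly valid, it just references someone else's data. The toy handler below (all names and records invented) shows the bug and the one-line check that fixes it.

```python
# Toy illustration of an IDOR (Insecure Direct Object Reference).
# All data and handler names here are hypothetical.
ORDERS = {
    1: {"owner": "alice", "total": 49.99},
    2: {"owner": "bob",   "total": 1200.00},
}

def get_order_vulnerable(user, order_id):
    # BUG: no ownership check; any authenticated user can read any order.
    return ORDERS[order_id]

def get_order_fixed(user, order_id):
    order = ORDERS[order_id]
    if order["owner"] != user:  # the missing authorisation check
        raise PermissionError("not your order")
    return order

print(get_order_vulnerable("alice", 2))  # alice reads bob's order
```

A scanner sees a 200 response either way; a tester (or an attacker) notices that incrementing the ID yields another customer's data.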

Business logic testing is where manual pen testing significantly outperforms automated scanners. A scanner doesn't understand that "create a coupon, apply it, then cancel the coupon after the purchase is complete" might result in a free order. A good pen tester does.
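The coupon scenario can be modelled in a few lines. This is a hypothetical sketch of one way such a flaw arises: cancellation restores the coupon's value, but nobody re-prices the already-completed order.

```python
# Toy model of the coupon business-logic flaw described above.
# The pricing logic and bug are invented for illustration.
def run_purchase_flow():
    coupon = {"value": 100.0, "active": True}
    order_price = 100.0

    # Checkout: the coupon covers the full price.
    amount_paid = order_price - coupon["value"] if coupon["active"] else order_price

    # After the purchase completes, the customer cancels the coupon.
    coupon["active"] = False
    coupon["value"] = 100.0  # BUG: value restored, order never re-priced

    return amount_paid, coupon["value"]

paid, coupon_left = run_purchase_flow()
print(paid, coupon_left)  # 0.0 paid, and the full coupon value is back
```

Every individual step here is "correct" in isolation, which is exactly why no scanner flags it; only someone who understands the intended business flow sees the free order.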

Network vs Application vs API Pen Testing

Different scopes focus on different attack surfaces:

  • Network pen testing: External and internal network infrastructure (firewalls, switches, VPNs, exposed services). Finds network-level vulnerabilities and misconfigurations.
  • Application pen testing: Web applications, mobile apps, thick clients. Focuses on application-layer vulnerabilities.
  • API pen testing: REST, GraphQL, SOAP APIs. Tests authentication, authorisation, data exposure, and API-specific vulnerabilities using the OWASP API Security Top 10.
  • Cloud/infrastructure pen testing: Cloud configuration, IAM, storage, serverless functions. Often requires cloud provider notification in advance.
  • Social engineering: Phishing, vishing, physical access attempts. Tests the human layer.

Tools Pentesters Use

  • Burp Suite Pro: The standard for web application and API testing. Intercepts, modifies, and replays HTTP requests. Includes an automated scanner, the Intruder tool for brute-forcing, and a massive library of extensions.
  • Metasploit Framework: The industry-standard exploitation framework. Modules for hundreds of known exploits, post-exploitation tools, and payload generation.
  • Nmap: Network scanning and service enumeration. Essential for the reconnaissance and scanning phases.
  • SQLMap: Automated SQL injection detection and exploitation; saves hours on database injection testing.
  • Nuclei: Fast, template-based vulnerability scanner. Excellent for finding known CVEs and misconfigurations quickly.
  • Amass / subfinder: Subdomain enumeration, mapping the full attack surface before testing begins.

How to Read a Pen Test Report

A good pen test report has two audiences: executive leadership (who need to understand business risk) and engineering (who need to fix specific issues). Structure to expect:

  • Executive summary: Overall risk level, key findings, remediation priorities. Written for non-technical stakeholders.
  • Scope and methodology: What was tested, how, from what starting position.
  • Findings: Each vulnerability with: severity rating, CVSS score, evidence, business impact, step-by-step reproduction, and remediation guidance.
  • Risk matrix: Summary of findings by severity.

When prioritising remediation, start with critical and high severity findings that are easily exploitable and have significant business impact. Don't just sort by CVSS; contextualise each finding against your actual risk profile.
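One way to operationalise that triage is a composite sort key instead of a plain CVSS sort. The findings and the key below are invented examples, not a standard methodology:

```python
# Sketch of contextual triage for pen test findings: sort by a composite
# key (severity, ease of exploitation, business impact) rather than CVSS
# alone. All findings and weightings below are hypothetical.
SEVERITY_RANK = {"critical": 4, "high": 3, "medium": 2, "low": 1}

findings = [
    {"title": "Outdated TLS config", "severity": "high",
     "cvss": 7.4, "easily_exploitable": False, "business_impact": 1},
    {"title": "IDOR on /invoices",   "severity": "high",
     "cvss": 6.5, "easily_exploitable": True,  "business_impact": 3},
    {"title": "Verbose error pages", "severity": "low",
     "cvss": 3.1, "easily_exploitable": True,  "business_impact": 1},
]

def triage_key(f):
    return (SEVERITY_RANK[f["severity"]],
            f["easily_exploitable"],
            f["business_impact"])

# The IDOR outranks the higher-CVSS TLS finding because it's trivially
# exploitable and touches customer data.
for f in sorted(findings, key=triage_key, reverse=True):
    print(f["title"])
```

The exact key matters less than the principle: exploitability and impact in your environment should be able to override the raw score.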

Red Team vs Pen Test vs Bug Bounty

  • Pen test: Comprehensive assessment of a defined scope within a fixed time window. Goal: find as many vulnerabilities as possible. Best for compliance requirements and comprehensive coverage.
  • Red team exercise: Simulates a specific threat actor attempting to achieve a specific objective (steal data, reach a specific system) using any means necessary, often including social engineering and physical access. Goal: test your detection and response capabilities, not just find vulnerabilities.
  • Bug bounty programme: Ongoing, crowdsourced security testing. Researchers report vulnerabilities in exchange for rewards. Best for continuous coverage after initial pen test findings are remediated.

When and How Often to Run Pen Tests

General guidance:

  • Annually at minimum for most organisations, and more often if you're in a regulated industry
  • Before major releases of new applications or significant feature changes
  • After significant infrastructure changes, such as new cloud environments or major architecture changes
  • After remediation of critical findings, to verify the fix is effective and no new issues were introduced
  • When required by compliance: PCI-DSS, SOC 2, ISO 27001, and HIPAA all expect or require penetration testing

Pen tests are point-in-time: a clean bill of health in January doesn't mean you're secure in June, after months of further development. This is why continuous automated scanning (SAST, DAST, SCA) between pen tests is essential.

Continuous Security Between Pen Tests

AquilaX provides continuous SAST, DAST, and SCA scanning, catching new vulnerabilities as your code changes, not just at your annual pen test.

Start Free Scan