Why DevSecOps Metrics Matter
Most security programmes generate enormous amounts of activity (scans running, findings reported, tickets created) without anyone being able to answer the question that actually matters: are we getting more secure over time? Metrics bridge the gap between security activity and security outcomes.
Without metrics, security teams argue for resources based on compliance requirements and gut feel. With metrics, they can demonstrate that MTTR for critical vulnerabilities dropped from 45 days to 12 days last quarter, that the vulnerability escape rate to production fell by 60% after introducing pre-commit scanning, and that 78% of developers are actively using the IDE security plugin. These are arguments that resonate with engineering leadership and boards.
Goodhart's Law applies here: "When a measure becomes a target, it ceases to be a good measure." Optimising for a metric at the expense of actual security outcomes is counterproductive. A fix rate that rises because developers are closing tickets as "won't fix" rather than actually remediating is worse than having no metric at all.
Choose metrics that measure outcomes, not outputs. Scanning coverage (output) matters less than vulnerability escape rate (outcome). Number of findings reported (output) matters less than MTTR (outcome). Build your metric set around what changes in the real world, not what your tools can easily report.
Mean Time to Remediate (MTTR) by Severity
MTTR measures the average time between a vulnerability being discovered and it being fully remediated: patched, mitigated, or accepted with documented risk. It is the single most important DevSecOps metric because it directly measures how effectively your organisation reduces exposure time.
What it measures: The speed of your remediation workflow from discovery to resolution.
How to calculate: (Sum of (resolution_date - discovery_date) for all resolved vulns) / count(resolved vulns)
What good looks like: Critical: under 7 days. High: under 30 days. Medium: under 90 days. Low: under 180 days.
Always segment MTTR by severity. An average MTTR across all severities hides the most important signal: how quickly are critical vulnerabilities being addressed? A team remediating Criticals in 3 days and Lows in 400 days has a healthy posture where it matters most, but a single blended MTTR of 80 days makes them look bad.
Improving MTTR
The biggest drivers of slow MTTR are: finding-to-developer routing friction (developers don't know a finding is theirs to fix), lack of remediation guidance (the finding says "SQL injection" but doesn't explain how to fix it in the specific context), and no escalation process for stale findings. Addressing these three things typically reduces MTTR by 40-60% without any additional tooling.
SELECT
    severity,
    COUNT(*) AS resolved_count,
    ROUND(AVG(EXTRACT(EPOCH FROM (resolved_at - discovered_at)) / 86400), 1) AS mttr_days
FROM vulnerability_findings
WHERE status = 'resolved'
  AND resolved_at >= NOW() - INTERVAL '90 days'
GROUP BY severity
ORDER BY CASE severity
    WHEN 'critical' THEN 1
    WHEN 'high' THEN 2
    WHEN 'medium' THEN 3
    WHEN 'low' THEN 4
END;
Vulnerability Escape Rate: What's Reaching Production?
Vulnerability escape rate measures the proportion of vulnerabilities that are first discovered in production (or by external reporters) rather than earlier in the development pipeline. A high escape rate means your pre-production controls aren't catching issues: vulnerabilities are slipping through SAST, DAST, and code review and only being found after deployment.
What it measures: The effectiveness of your shift-left security controls.
How to calculate: (Vulnerabilities first found in production / Total vulnerabilities found) × 100
What good looks like: Under 10% escape rate means your pre-production controls are catching the vast majority of issues. Above 30% suggests your pipeline controls need significant improvement.
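The calculation above can be sketched in a few lines. This is a minimal illustration: the finding records and the `first_found_in` field name are hypothetical, not taken from any specific scanner's output.

```python
# Escape rate: share of vulnerabilities first discovered in production
# rather than by pre-production controls (SAST, DAST, code review).
# The finding dicts below are invented examples.

def escape_rate(findings):
    """Return the vulnerability escape rate as a percentage."""
    if not findings:
        return 0.0
    escaped = sum(1 for f in findings if f["first_found_in"] == "production")
    return round(100 * escaped / len(findings), 1)

findings = [
    {"id": 1, "first_found_in": "sast"},
    {"id": 2, "first_found_in": "production"},
    {"id": 3, "first_found_in": "code_review"},
    {"id": 4, "first_found_in": "production"},
    {"id": 5, "first_found_in": "dast"},
]

print(escape_rate(findings))  # 2 of 5 escaped -> 40.0
```

In practice the same ratio would be computed in your findings database, segmented by quarter so the trend is visible.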
Escape rate is particularly useful for demonstrating the ROI of pre-production security investments. If you introduce a new SAST tool and the escape rate drops from 25% to 8%, that's a concrete demonstration that the tool is preventing vulnerabilities from reaching production, where they're expensive to fix and potentially exploitable.
Track escape rate by vulnerability type to identify systematic gaps. If SQL injection vulnerabilities have a 40% escape rate but XSS has a 5% escape rate, your SAST tool likely has strong XSS rules but weak SQL injection rules, or your specific SQL library isn't covered by the analyser's patterns.
Security Debt: Measuring the Backlog
Security debt is the accumulation of known, unresolved vulnerabilities: findings that have been discovered but not yet remediated. Like technical debt, it grows silently and compounds over time: older vulnerabilities are harder to fix as code evolves, and a large backlog can become so daunting that teams stop prioritising it entirely.
What it measures: The accumulated risk exposure from unresolved findings.
How to calculate: Count of open findings, segmented by severity and age. A weighted score can aggregate across severities: (Critical × 10) + (High × 3) + (Medium × 1) + (Low × 0.1)
What good looks like: Zero Critical findings older than 7 days. Total debt trending down or flat quarter-over-quarter.
Measure debt trends, not just debt levels. An organisation discovering and remediating 100 vulnerabilities per week with 50 open at any time is in a much healthier position than one discovering 10 per week with 200 open. The rate of debt creation versus debt resolution tells the real story.
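The weighted score is straightforward to compute. A minimal sketch, using the weights from the formula above; the sample counts are invented for illustration:

```python
# Weighted security-debt score over open findings.
# Weights follow the article's formula: Critical x10, High x3,
# Medium x1, Low x0.1. The sample counts are hypothetical.

WEIGHTS = {"critical": 10, "high": 3, "medium": 1, "low": 0.1}

def debt_score(open_counts):
    """Aggregate open findings into a single weighted debt number."""
    return sum(WEIGHTS[sev] * n for sev, n in open_counts.items())

open_counts = {"critical": 2, "high": 10, "medium": 40, "low": 100}
print(debt_score(open_counts))  # 2*10 + 10*3 + 40*1 + 100*0.1 = 100.0
```

Computing this score weekly and plotting it over time gives the trend view the text recommends: the direction matters more than the absolute number.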
Security Debt by Team or Service
Segment security debt by owning team or service to create accountability. Publishing a "security debt leaderboard" (or conversely, a list of teams with the lowest debt) gives engineering leadership visibility into which teams need support and creates positive competitive pressure. Teams don't like being last on the list.
Debt accumulation risk: A security debt backlog exceeding 18 months of normal remediation velocity creates a practical trap: the team can never catch up while also keeping pace with new findings. If you're in this situation, a focused debt reduction sprint (dedicate one developer per team for a quarter solely to security debt) is often more effective than incremental progress.
False Positive Rate: Is Your Tooling Creating Noise?
False positive rate measures the proportion of scanner findings that are not actually vulnerabilities: findings that are technically triggered but don't represent exploitable issues in context. High false positive rates are one of the leading causes of DevSecOps programme failure: when developers learn that most security findings are noise, they stop triaging them.
What it measures: The signal-to-noise ratio of your security tooling.
How to calculate: (Findings marked as false positive / Total findings) × 100, sampled over a review period.
What good looks like: Under 15% false positive rate. Above 30% requires immediate tool tuning or replacement.
Measure false positive rate per tool, not in aggregate. A SAST tool with a 40% FP rate and a dependency scanner with a 5% FP rate both contribute to developer fatigue, but they require different interventions. The SAST tool needs rule tuning; the dependency scanner might just need a few ignores for known non-exploitable CVEs.
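The per-tool segmentation can be sketched from triage records. This is an assumption-laden illustration: the `(tool, verdict)` tuples and verdict labels are invented, and a real pipeline would pull them from the triage system.

```python
# False positive rate per tool, computed from triage decisions.
# Each record is (tool, verdict); the records below are illustrative.
from collections import defaultdict

def fp_rate_by_tool(triaged):
    """Return {tool: false-positive percentage} from triage records."""
    counts = defaultdict(lambda: {"fp": 0, "total": 0})
    for tool, verdict in triaged:
        counts[tool]["total"] += 1
        if verdict == "false_positive":
            counts[tool]["fp"] += 1
    return {t: round(100 * c["fp"] / c["total"], 1) for t, c in counts.items()}

triaged = [
    ("sast", "false_positive"), ("sast", "confirmed"),
    ("sast", "false_positive"), ("sast", "confirmed"), ("sast", "confirmed"),
    ("dep_scan", "confirmed"), ("dep_scan", "confirmed"),
    ("dep_scan", "confirmed"), ("dep_scan", "false_positive"),
]

print(fp_rate_by_tool(triaged))  # {'sast': 40.0, 'dep_scan': 25.0}
```

A sampled manual triage (say, 50 findings per tool per quarter) is usually enough to keep this metric honest without triaging everything.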
False positive rate is also a useful signal for evaluating new tools. When evaluating a new SAST vendor, run it against a codebase you know well and manually triage a sample of findings. A tool claiming high detection rates with a 50% false positive rate is not useful at scale: developer time spent on false positives is security investment with negative returns.
Developer Adoption: Are Devs Actually Using the Tools?
A security tool that runs in CI/CD but whose findings are never addressed is not providing security value; it's providing an audit trail. Developer adoption metrics measure whether security tooling is integrated into how developers actually work, not just whether it's technically deployed.
What it measures: Actual developer engagement with security tooling and findings.
How to calculate: Percentage of developers with the IDE plugin installed and active; percentage of security findings viewed within 48 hours of creation; percentage of MRs with security findings that receive a comment from the developer.
What good looks like: IDE plugin adoption above 80%. Finding view rate above 90% within 48 hours. Developer-acknowledged findings above 85%.
Developer adoption directly predicts MTTR. Teams with high adoption rates (where developers see security findings as part of their normal workflow rather than an external audit function) remediate faster because there's no routing friction and no cultural resistance. Adoption is the leading indicator; MTTR is the lagging indicator.
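The adoption measures above can be sketched as simple ratios. The developer and finding records here are hypothetical, as are the field names (`plugin_active`, `viewed_at`, `created_at`):

```python
# Developer adoption: IDE plugin uptake and finding view rate.
# All records below are invented examples.
from datetime import datetime, timedelta

def plugin_adoption(devs):
    """Percentage of developers with an active IDE security plugin."""
    active = sum(1 for d in devs if d["plugin_active"])
    return round(100 * active / len(devs), 1)

def view_rate_within(findings, hours=48):
    """Percentage of findings viewed within the given window."""
    on_time = sum(
        1 for f in findings
        if f["viewed_at"] is not None
        and f["viewed_at"] - f["created_at"] <= timedelta(hours=hours)
    )
    return round(100 * on_time / len(findings), 1)

devs = [{"plugin_active": True}] * 4 + [{"plugin_active": False}]
t0 = datetime(2024, 1, 1)
findings = [
    {"created_at": t0, "viewed_at": t0 + timedelta(hours=2)},
    {"created_at": t0, "viewed_at": t0 + timedelta(hours=72)},
    {"created_at": t0, "viewed_at": None},
    {"created_at": t0, "viewed_at": t0 + timedelta(hours=24)},
]

print(plugin_adoption(devs))       # 4 of 5 -> 80.0
print(view_rate_within(findings))  # 2 of 4 within 48h -> 50.0
```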
Improving Developer Adoption
- Surface findings where developers already work: IDE plugins, pull request comments, and Slack/Teams notifications outperform separate security portals
- Provide remediation guidance alongside findings: "here's the fix" improves adoption more than "here's the problem"
- Set up automatic ticket creation in the team's existing issue tracker; finding-to-ticket friction is a major adoption killer
- Run regular developer security workshops that explain why findings matter in the context of your actual product
Coverage Metrics: What Percentage of Code Is Scanned?
Coverage metrics answer a basic but critical question: are your security controls actually reaching all the code and infrastructure that matters? A security programme that scans 40% of repositories provides a false sense of security; the unscanned 60% is where attackers will find the easiest targets.
Track multiple dimensions of coverage:
- Repository coverage: What percentage of active repositories have at least one security scanner configured?
- Scanner type coverage: For repositories with scanning, which scanner types are active? A repository with SAST but no dependency scanning has a significant blind spot.
- Branch coverage: Are scans running on all feature branches, or only on main? Pre-merge scanning is significantly more effective at preventing vulnerabilities than post-merge scanning.
- Infrastructure coverage: What percentage of IaC code (Terraform, Kubernetes manifests, CloudFormation) is scanned for misconfigurations?
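The first two dimensions above can be computed from a simple repo-to-scanner mapping. A minimal sketch; the repository names, scanner types, and `enabled` pairs are invented for illustration:

```python
# Repository coverage: which scanner types run on which repos.
# The repo names and (repo, scanner) pairs are hypothetical.

SCANNER_TYPES = ["sast", "dependency", "secrets", "iac"]

def coverage(repos, enabled):
    """Percentage of repos covered by each scanner type.

    repos: list of repo names; enabled: set of (repo, scanner) pairs.
    """
    return {
        s: round(100 * sum((r, s) in enabled for r in repos) / len(repos), 1)
        for s in SCANNER_TYPES
    }

repos = ["api", "web", "infra", "billing"]
enabled = {
    ("api", "sast"), ("api", "dependency"),
    ("web", "sast"), ("web", "dependency"), ("web", "secrets"),
    ("infra", "iac"),
}

print(coverage(repos, enabled))
# {'sast': 50.0, 'dependency': 50.0, 'secrets': 25.0, 'iac': 25.0}
```

The same per-scanner breakdown, segmented by team, gives the heatmap view described in the dashboard section.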
Target 100% for critical systems: Full coverage of every repository is the goal, but prioritise. Tier 1 systems (production, customer data, authentication) should have 100% coverage across all scanner types. Tier 2 and 3 systems can have a lighter-touch baseline while you expand coverage systematically.
DORA Metrics and Security: The Connection
The DORA (DevOps Research and Assessment) metrics are the standard framework for measuring software delivery performance: deployment frequency, lead time for changes, change failure rate, and mean time to recovery. Security is deeply intertwined with all four.
Elite DORA performers (high deployment frequency, low change failure rate) are also better security performers. The same practices that enable fast, reliable deployments (automated testing, small changes, rapid feedback loops) also enable fast vulnerability detection and remediation. Security and delivery velocity are not in tension; they reinforce each other when the tooling and culture are right.
Change Failure Rate and Security
Change failure rate measures the percentage of deployments that cause a production incident. Security vulnerabilities that reach production are a form of change failure. Tracking security-related production incidents as a component of change failure rate gives security a seat at the DORA metrics table, framing security outcomes in language engineering leadership already measures and cares about.
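Breaking the security component out of the overall rate can be sketched like this. The deployment records and the `incident` field are hypothetical examples:

```python
# Change failure rate, with security incidents broken out as a component.
# The deployment records below are invented for illustration.

def change_failure_rate(deploys):
    """Return overall and security-attributed change failure percentages."""
    failed = [d for d in deploys if d["incident"] is not None]
    security = [d for d in failed if d["incident"] == "security"]
    total = len(deploys)
    return {
        "overall_pct": round(100 * len(failed) / total, 1),
        "security_pct": round(100 * len(security) / total, 1),
    }

deploys = (
    [{"incident": None}] * 17
    + [{"incident": "availability"}] * 2
    + [{"incident": "security"}] * 1
)

print(change_failure_rate(deploys))
# {'overall_pct': 15.0, 'security_pct': 5.0}
```

Reporting the security share alongside the overall rate is what gives security that seat at the DORA table.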
MTTR and Security Incidents
DORA's MTTR (mean time to recover from incidents) overlaps with security incident response time. A team with strong DORA performance (clear ownership, automated rollback, rapid deployment) also recovers faster from security incidents. Building security incident response drills into your broader incident response programme rather than treating them separately improves both metrics.
Building a DevSecOps Metrics Dashboard
A metrics dashboard is only valuable if it drives decisions. Design your dashboard for its audience: security engineers need operational metrics (current open Criticals, MTTR trends, FP rate by tool); engineering managers need team-level metrics (debt by service, developer adoption); executives need outcome metrics (escape rate trend, overall security debt, programme coverage).
dashboard:
  title: DevSecOps Programme Metrics
  refresh: 1h
  panels:
    - title: Open Critical Vulnerabilities
      type: stat
      thresholds:
        - value: 0
          color: green
        - value: 1
          color: red
    - title: MTTR by Severity (30-day rolling)
      type: table
      columns: [severity, mttr_days, target_days, delta]
    - title: Vulnerability Escape Rate (90-day trend)
      type: timeseries
      target_line: 10  # 10% target
    - title: Repository Coverage by Scanner Type
      type: heatmap
      axes: [team, scanner_type]
    - title: Developer Adoption Rate
      type: gauge
      max: 100
      thresholds:
        - value: 80
          color: green
        - value: 50
          color: yellow
        - value: 0
          color: red
Start with three to five metrics and mature the dashboard over time. Trying to track everything immediately produces a noisy dashboard that nobody reads. Begin with MTTR by severity, escape rate, and repository coverage; these three together give a reasonable picture of programme health. Add false positive rate and developer adoption as your data collection matures.
Review metrics in a regular cadence: a monthly security metrics review with engineering leadership maintains accountability and creates space to discuss trends, blockers, and resource needs. Metrics without a review process are just numbers in a database.
Measure Your DevSecOps Programme
AquilaX provides built-in security metrics dashboards: track MTTR, vulnerability escape rate, coverage, and developer adoption out of the box, without building a bespoke data pipeline.