The False Positive Problem
A SAST tool running in default configuration on a large enterprise codebase produces hundreds or thousands of findings per scan. In practice, 60–90% of these are false positives: findings that are syntactically flagged but not actually exploitable in context.
The consequences are well-documented: developers learn to ignore scanner output, security engineers spend weeks triaging instead of remediating, and the real vulnerabilities hide in the noise. False positive rates above 50% are correlated with programme abandonment within 18 months.
The compounding problem: when developers see a SAST finding they know is a false positive and cannot dismiss it easily, they begin to distrust all findings, including the real ones. Alert fatigue in security is more dangerous than no alerting.
How AI Triage Works
AI triage sits between the scanner output and the developer's view. It classifies each finding as "likely true positive", "likely false positive", or "needs review" before surfacing it to engineers.
The classifier uses contextual features the scanner itself does not evaluate:
- Is the flagged variable ever populated with user-controlled input upstream?
- Does a sanitisation function exist in the call chain?
- Is this a test file or mock where vulnerability impact is irrelevant?
- Has this exact code pattern been previously confirmed as a false positive by a human?
- Does the flagged function execute in an environment where the vulnerability is reachable?
LLM-based classifiers can answer these questions by reasoning over the code context. Fine-tuned smaller models can match LLM accuracy at a fraction of the cost at scale.
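As a sketch of the LLM-based approach, the contextual questions above can be packed into a single classification prompt. The finding schema here (`rule_id`, `file_path`, `snippet`, `enclosing_function`) is illustrative, not tied to any particular scanner's output format:

```python
def build_triage_prompt(finding: dict) -> str:
    """Assemble an LLM classification prompt from one SAST finding.

    The field names in `finding` are hypothetical -- adapt them to
    whatever your scanner actually emits.
    """
    return (
        "You are triaging a static-analysis finding. Answer with one of: "
        "TRUE_POSITIVE, FALSE_POSITIVE, NEEDS_REVIEW.\n\n"
        f"Rule: {finding['rule_id']}\n"
        f"File: {finding['file_path']}\n"
        f"Flagged snippet:\n{finding['snippet']}\n\n"
        f"Enclosing function:\n{finding['enclosing_function']}\n\n"
        "Consider: does user-controlled input reach the flagged expression? "
        "Is a sanitisation function in the call chain? Is this test or mock "
        "code? Is the code path reachable in production?"
    )

prompt = build_triage_prompt({
    "rule_id": "python.sql-injection",
    "file_path": "app/db/queries.py",
    "snippet": 'cursor.execute("SELECT * FROM users WHERE id = " + user_id)',
    "enclosing_function": "def get_user(user_id): ...",
})
```

Forcing the model to answer in a closed label set makes the response trivially parseable; the "needs review" label from the triage flow maps directly onto NEEDS_REVIEW.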
Building Your Classifier
Step 1: Label historical findings
Extract your last 6 months of SAST findings with human triage decisions (true positive / false positive). This is your training dataset. You need a minimum of ~500 labelled examples per vulnerability class for a meaningful classifier.
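A minimal sketch of this step, assuming each historical record carries a `rule_id` and a human `verdict` field (both names illustrative): keep only labelled findings, then flag vulnerability classes that fall short of the ~500-example minimum.

```python
from collections import Counter

MIN_EXAMPLES_PER_CLASS = 500  # minimum per vulnerability class, per the guidance above

def build_training_set(findings: list[dict]) -> tuple[list[dict], dict]:
    """Filter to findings with a human triage verdict and report
    per-class counts so under-represented classes can be flagged."""
    labelled = [
        f for f in findings
        if f.get("verdict") in ("true_positive", "false_positive")
    ]
    counts = Counter(f["rule_id"] for f in labelled)
    # Classes below the minimum need more labelling before training
    thin = {rule: n for rule, n in counts.items() if n < MIN_EXAMPLES_PER_CLASS}
    return labelled, thin
```

Classes reported in `thin` are candidates for either more labelling effort or exclusion from the first model iteration.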
Step 2: Feature extraction
For each finding, extract: the flagged code snippet, the surrounding function, the file path (test vs production), the import list, any sanitisation patterns present, and the call depth from user input to the vulnerable expression.
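The extraction step might look like the following sketch. The test-path regex and the sanitiser name list are example heuristics only; call depth is assumed to come from the scanner's own dataflow output when available.

```python
import re

# Illustrative sanitiser names -- replace with your codebase's actual helpers
SANITISERS = ("escape(", "parameterize(", "sanitize(", "quote(")

def extract_features(finding: dict) -> dict:
    """Derive the contextual features listed above from one finding."""
    context = finding.get("enclosing_function", "")
    path = finding.get("file_path", "")
    return {
        "snippet": finding.get("snippet", ""),
        # Heuristic: test/mock directories or *_test.* filenames
        "is_test_file": bool(re.search(r"(^|/)(tests?|mocks?)/|_test\.", path)),
        "imports": finding.get("imports", []),
        "has_sanitiser": any(s in context for s in SANITISERS),
        # Populated from scanner dataflow analysis, if the scanner provides it
        "call_depth_from_input": finding.get("call_depth"),
    }
```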
Step 3: Choose your classification approach
Three options in increasing complexity and cost:
- Rule-based filters – suppress findings in test files, suppress known-safe patterns. Zero ML, high precision for specific cases.
- Fine-tuned classifier – fine-tune a small model (CodeBERT, StarCoder 1B) on your labelled dataset. Fast, cheap to run, good for high-volume suppression.
- LLM-based reasoning – send context to GPT-4o or Claude with a classification prompt. Slower and more expensive but higher accuracy on novel patterns.
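The first option is the cheapest to demonstrate. A minimal sketch, with illustrative glob patterns and one hypothetical known-safe pair (a weak-hash rule firing on md5 used for cache keys):

```python
import fnmatch

# Illustrative suppression rules -- tune to your repository layout
TEST_GLOBS = ("*/tests/*", "*_test.py", "*/mocks/*")
# (rule_id, snippet substring) pairs a human has previously confirmed safe
KNOWN_SAFE = {("python.weak-hash", "hashlib.md5")}

def rule_based_verdict(finding: dict) -> str:
    """Return 'suppress' for test-file findings and known-safe patterns,
    'keep' for everything else."""
    path = finding["file_path"]
    if any(fnmatch.fnmatch(path, g) for g in TEST_GLOBS):
        return "suppress"
    for rule_id, marker in KNOWN_SAFE:
        if finding["rule_id"] == rule_id and marker in finding["snippet"]:
            return "suppress"
    return "keep"
```

Because these rules only match exactly what they name, precision stays high; anything they do not match falls through to the ML-based stages.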
Step 4: Confidence thresholds
Auto-suppress only findings with at least 85% confidence in the false positive class. Surface findings with at least 85% confidence in the true positive class as high priority. Everything in between goes to manual review.
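The routing logic is small enough to sketch directly. The 85% thresholds come from the guidance above; the critical-severity override reflects the safeguard described in the next section:

```python
SUPPRESS_THRESHOLD = 0.85  # FP-class confidence needed to auto-suppress
PRIORITY_THRESHOLD = 0.85  # TP-class confidence needed to fast-track

def route(p_true_positive: float, p_false_positive: float,
          severity: str = "medium") -> str:
    """Route one finding given classifier class probabilities."""
    if severity == "critical":
        # Critical findings always get human eyes, whatever the classifier says
        return "manual_review"
    if p_false_positive >= SUPPRESS_THRESHOLD:
        return "suppress"
    if p_true_positive >= PRIORITY_THRESHOLD:
        return "high_priority"
    return "manual_review"
```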
Avoiding Blind Spots
The risk of AI triage is systematic suppression of a real vulnerability class. Several safeguards:
- Never auto-suppress critical severity findings – route them to manual review regardless of classifier confidence.
- Monthly accuracy audit – sample 50 auto-suppressed findings per month and have a human verify they are actually false positives.
- Drift detection – if a new code pattern appears that the classifier has never seen, flag it for human review rather than suppressing it.
- Penetration test feedback loop – when a pen test confirms a vulnerability that your SAST flagged but triage auto-suppressed, feed it back into the classifier as a counterexample.
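The monthly audit is straightforward to automate as a sampling step. A minimal sketch; the fixed seed is an assumption here, chosen so the audit sample is reproducible in the audit record:

```python
import random

AUDIT_SAMPLE_SIZE = 50  # per the monthly audit safeguard above

def audit_sample(suppressed: list[dict], seed=None) -> list[dict]:
    """Draw the monthly sample of auto-suppressed findings for human review.
    If fewer findings were suppressed than the sample size, audit them all."""
    if len(suppressed) <= AUDIT_SAMPLE_SIZE:
        return list(suppressed)
    return random.Random(seed).sample(suppressed, AUDIT_SAMPLE_SIZE)
```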
Never train your triage classifier to optimise for developer satisfaction. A classifier that learns "developers always dismiss XSS findings in the frontend" will start suppressing real XSS vulnerabilities.
Measuring Success
The right metrics for an AI triage programme:
- False positive rate – percentage of surfaced findings that are confirmed false positives after human review. Target: under 20%.
- False negative rate – percentage of suppressed findings that turn out to be real vulnerabilities. Target: under 2%. This is the safety metric.
- Mean time to triage – time from finding generation to developer action. AI triage should cut this by 60%+.
- Developer trust score – monthly survey: "Do you trust that SAST findings are actionable?" Target: 70%+ positive.
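The two rate metrics fall directly out of post-hoc human verdicts on each pool of findings. A sketch, assuming each record carries a `verdict` field (name illustrative) set after review:

```python
def triage_metrics(surfaced: list[dict], suppressed: list[dict]) -> dict:
    """Compute the false positive rate (over surfaced findings) and the
    false negative rate (over suppressed findings) from human verdicts."""
    fp_surfaced = sum(f["verdict"] == "false_positive" for f in surfaced)
    fn_suppressed = sum(f["verdict"] == "true_positive" for f in suppressed)
    return {
        "false_positive_rate": fp_surfaced / len(surfaced) if surfaced else 0.0,
        "false_negative_rate": fn_suppressed / len(suppressed) if suppressed else 0.0,
    }
```

Note the denominators differ deliberately: the FP rate is judged against what developers actually saw, while the FN rate (the safety metric) is judged against what the classifier hid.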
"The goal of AI triage is not to eliminate human judgment. It is to ensure that when a human spends 30 minutes on a finding, that time was worth spending."