What AI Can Actually Build
Modern AI coding assistants (Cursor, Claude, ChatGPT, Copilot) can now generate surprisingly complete applications from a single prompt. A "build me a SaaS task manager with auth, Stripe billing, and a REST API" prompt will produce working code across all layers in minutes.
What does "working" look like?
- Database schema with models and migrations
- Authentication routes with JWT or session management
- CRUD endpoints with basic validation
- Frontend components with state management
- Payment integration with webhook handling
- Docker configuration and basic CI/CD
This is genuinely impressive. A solo developer can go from idea to deployed MVP in days rather than weeks. The speed advantage is real. The problem is that "working" and "secure" are different standards, and AI optimises for the former.
The optimisation gap: AI training data skews toward code that runs, not code that is secure. Stack Overflow, GitHub, and tutorial sites are full of correct-but-insecure examples. AI absorbs all of it.
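String-built SQL is the canonical correct-but-insecure tutorial pattern: it runs perfectly in every demo and is injectable in production. A minimal, self-contained sketch (hypothetical table and data, using Python's stdlib `sqlite3`):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, email TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice@example.com')")
conn.execute("INSERT INTO users VALUES (2, 'bob@example.com')")

def find_user_unsafe(email):
    # Tutorial-style code: works for normal input, but the value is
    # spliced into the SQL string, so crafted input rewrites the query
    return conn.execute(
        f"SELECT id, email FROM users WHERE email = '{email}'"
    ).fetchall()

def find_user_safe(email):
    # Parameterized query: the driver treats the value as data, not SQL
    return conn.execute(
        "SELECT id, email FROM users WHERE email = ?", (email,)
    ).fetchall()

print(find_user_unsafe("' OR '1'='1"))  # leaks every row
print(find_user_safe("' OR '1'='1"))    # returns []
```

Both functions pass a naive "does it work" test with a normal email address, which is exactly why the unsafe version survives in training data.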
Measuring Security Debt in AI-Generated Apps
Security debt is the gap between your current security posture and the level required for your threat model. AI-generated apps typically accumulate significant debt quickly because the generation process has no security feedback loop.
A typical finding profile
Running SAST + SCA + secrets scanning on a medium-complexity AI-generated app (auth, API, database, frontend) typically surfaces:
- SAST findings: 15–40 issues: SQL injection, XSS, SSRF, insecure deserialization, path traversal
- SCA findings: 20–80 vulnerable dependencies, typically 3–8 critical severity
- Secrets findings: 2–10 hardcoded credentials or tokens
- IaC findings: 10–30 misconfigurations if infrastructure code was generated
At a conservative estimate of 2 hours per critical finding to triage and fix, a 5-critical-finding app represents 10 hours of security remediation on top of the 2-hour build time. The security cost can exceed the build cost by 5× or more.
The invisible debt problem: Most developers running AI-generated apps have never scanned them. They do not know the debt exists until something is breached or a compliance audit surfaces it.
Auth and Session Mistakes
Authentication and session management are areas where AI-generated code is particularly risky. The reason: auth is complex, the patterns vary significantly across frameworks, and the "easy" pattern that appears in tutorials is often subtly wrong.
Common AI auth mistakes
- Weak JWT secrets: Using short, guessable strings like `"secret"` or `"change-me"` as signing keys
- Algorithm confusion: Accepting `alg: none` in JWT validation, allowing unsigned tokens
- No token expiry: Generating tokens without `exp` claims, so tokens never expire
- Password storage: Occasionally storing plaintext passwords or using weak hashing (MD5, SHA-1)
- Session fixation: Not regenerating session IDs after authentication
```javascript
// AI-generated: multiple issues
const token = jwt.sign(
  { userId: user.id },
  "secret",              // hardcoded weak secret
  { algorithm: "HS256" } // no expiry
);
```

```javascript
// Correct
const token = jwt.sign(
  { userId: user.id, iat: Math.floor(Date.now() / 1000) },
  process.env.JWT_SECRET, // from env, min 32 bytes
  { algorithm: "HS256", expiresIn: "15m" }
);
```
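The algorithm-confusion mistake above can be blocked by pinning the algorithms you accept before any verification happens. A stdlib-only sketch of the header check (in a real app, pass an `algorithms` allowlist to a maintained JWT library rather than hand-rolling this):

```python
import base64
import json

ALLOWED_ALGS = {"HS256"}  # pin exactly what the server signs with

def jwt_header(token: str) -> dict:
    """Decode the (unverified) JWT header segment."""
    seg = token.split(".")[0]
    seg += "=" * (-len(seg) % 4)  # restore base64url padding
    return json.loads(base64.urlsafe_b64decode(seg))

def check_algorithm(token: str) -> None:
    alg = jwt_header(token).get("alg")
    if alg not in ALLOWED_ALGS:
        # Rejects alg: none and any downgrade/confusion attempt
        raise ValueError(f"disallowed JWT algorithm: {alg}")

# An unsigned alg: none token is rejected before any signature check
header = base64.urlsafe_b64encode(b'{"alg":"none"}').rstrip(b"=").decode()
try:
    check_algorithm(f"{header}.e30.")
except ValueError as e:
    print(e)
```

The important property is that the allowlist lives on the server and never comes from the token itself.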
Data Exposure Patterns
AI tends to return entire database objects from API endpoints rather than projecting only the fields the client needs. This is the most common source of data exposure in AI-generated apps.
Over-exposed API responses
When AI generates a user profile endpoint, it typically returns the full user record, including password hash, internal flags, admin status, and other fields the frontend never uses. Any of these extra fields could be exploited by an attacker.
```python
# AI-generated: returns the full object
@app.get("/users/me")
def get_me(user=Depends(get_current_user)):
    return user  # includes hashed_password, is_admin, etc.
```

```python
# Correct: explicit response model
class UserPublic(BaseModel):
    id: int
    email: str
    display_name: str

@app.get("/users/me", response_model=UserPublic)
def get_me(user=Depends(get_current_user)):
    return user
```
Dependency Risk
AI selects dependencies based on training data popularity β which means it picks packages that were popular when the training data was collected. Many of these are now unmaintained or have known CVEs.
The outdated package problem
In a typical AI-generated Node.js app, 20–30% of direct dependencies will have at least one known vulnerability. This is not unique to AI (human developers face the same problem) but AI compounds it because it never prompts you to run `npm audit`.
- AI picks `express 4.x` when `express 5.x` has been stable for over a year
- AI recommends `jsonwebtoken 8.x`, a version with known algorithm-confusion vulnerabilities
- AI selects deprecated image processing libraries that have path traversal CVEs
SCA matters here: Software Composition Analysis (SCA) will immediately surface these. It takes about 30 seconds to run and is the single highest-ROI security check for AI-generated apps.
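A toy illustration of what SCA automates. The minimum-version table here is hypothetical and hardcoded for the sketch; real SCA tools resolve the full dependency tree and consult CVE databases:

```python
import json

# Hypothetical minimum safe majors; a real SCA tool queries CVE feeds
MIN_SAFE_MAJOR = {"express": 5, "jsonwebtoken": 9}

def flag_outdated(package_json: str) -> list[str]:
    """Flag direct dependencies pinned below a known-safe major version."""
    deps = json.loads(package_json).get("dependencies", {})
    flagged = []
    for name, spec in deps.items():
        # Crude major-version extraction from specs like "^4.18.2" or "8.5.1"
        major = int(spec.lstrip("^~").split(".")[0])
        if name in MIN_SAFE_MAJOR and major < MIN_SAFE_MAJOR[name]:
            flagged.append(f"{name}@{spec} (want >= {MIN_SAFE_MAJOR[name]}.x)")
    return flagged

pkg = '{"dependencies": {"express": "^4.18.2", "jsonwebtoken": "8.5.1"}}'
print(flag_outdated(pkg))
```

In practice you never write this yourself: `npm audit` or `pip-audit` does the same comparison against a live vulnerability database.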
When AI Genuinely Helps Security
AI is not uniformly bad at security. There are areas where it genuinely improves the security posture of AI-generated apps compared to rushed human code:
- Password hashing: AI almost always uses bcrypt or argon2 correctly when generating user registration flows
- HTTPS by default: AI-generated server configurations typically enforce TLS
- Input validation presence: AI usually adds some level of input validation β even if it is not complete
- CSRF tokens: Modern AI assistants aware of framework conventions will include CSRF protection in form-handling code
- Error message hygiene: AI often avoids returning raw stack traces in production error handlers
The pattern is that AI handles the well-documented, widely-discussed security patterns reasonably well. It fails on the subtle, context-dependent ones (BOLA, business logic flaws, insecure direct object references) that require understanding the application's specific threat model.
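BOLA is a one-line omission that no scanner can reliably infer, because only you know which user owns which record. A minimal sketch of the ownership check, using a hypothetical in-memory store in place of a database:

```python
# Hypothetical in-memory store standing in for a documents table
DOCUMENTS = {
    1: {"owner_id": 10, "body": "alice's notes"},
    2: {"owner_id": 20, "body": "bob's notes"},
}

class Forbidden(Exception):
    pass

def get_document(doc_id: int, current_user_id: int) -> dict:
    doc = DOCUMENTS[doc_id]
    # The BOLA fix: the record must belong to the requester.
    # AI-generated handlers typically fetch by id and skip this line.
    if doc["owner_id"] != current_user_id:
        raise Forbidden(f"user {current_user_id} does not own document {doc_id}")
    return doc

print(get_document(1, current_user_id=10)["body"])  # owner: allowed
try:
    get_document(1, current_user_id=20)  # another user: rejected
except Forbidden as e:
    print(e)
```

The handler without the ownership check is indistinguishable from a correct one to SAST, which is why this belongs in the manual-review step of any workflow.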
A Secure AI App Workflow
You can use AI to build complete apps and still keep security debt manageable. The key is integrating security checks immediately rather than saving them for "later" (which never comes).
Minimum viable secure process
- Generate the app: use AI to scaffold the full codebase
- Run SCA immediately: `npm audit` / `pip-audit` before writing a single line of custom code. Fix critical CVEs in generated dependencies now.
- Secrets scan before first commit: catch any hardcoded credentials before they enter git history
- SAST on the full codebase: run a static analysis pass to find injection points, auth issues, and data exposure patterns
- Review auth and ownership logic manually: AI auth mistakes cannot be caught by SAST alone. Read the auth flow with fresh eyes.
- Set up CI gates: configure SAST and SCA to block merges on high/critical findings from this point forward
```yaml
name: Security Gates
on: [push, pull_request]
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: SCA - Dependency Audit
        run: npm audit --audit-level=high
      - name: Secrets Scan
        uses: aquilax/scan-action@v1
        with:
          scan-type: secrets,sast
          fail-on: high
```
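The secrets-scan-before-first-commit step can also be enforced locally, so credentials never reach git history at all. A sketch of a pre-commit hook configuration, assuming the gitleaks pre-commit integration (the pinned `rev` is illustrative; pin whichever release you verify):

```yaml
# .pre-commit-config.yaml: scans staged changes for secrets on every commit
repos:
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.4
    hooks:
      - id: gitleaks
```

Local hooks catch the mistake seconds after it happens; the CI gate above remains the authoritative backstop.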
This is not a complete security programme, but it catches the majority of AI-generated vulnerability classes before they reach production. The investment is roughly one hour of setup for potentially dozens of hours of breach response saved.
Scan Your AI-Generated App
AquilaX runs SAST, SCA, and secrets scanning across your entire codebase in minutes. See the real security cost of your AI app.
Start Free Scan