What AI Can Actually Build
Modern AI coding assistants (Cursor, Claude, ChatGPT, Copilot) can now generate surprisingly complete applications from a single prompt. A "build me a SaaS task manager with auth, Stripe billing, and a REST API" prompt will produce working code across all layers in minutes.
What does "working" look like?
- Database schema with models and migrations
- Authentication routes with JWT or session management
- CRUD endpoints with basic validation
- Frontend components with state management
- Payment integration with webhook handling
- Docker configuration and basic CI/CD
This is genuinely impressive. A solo developer can go from idea to deployed MVP in days rather than weeks. The speed advantage is real. The problem is that "working" and "secure" are different standards, and AI optimises for the former.
The optimisation gap: AI training data skews toward code that runs, not code that is secure. Stack Overflow, GitHub, and tutorial sites are full of correct-but-insecure examples. AI absorbs all of it.
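String-built SQL is the canonical correct-but-insecure tutorial pattern: it runs perfectly in every demo and is injectable in production. A minimal, self-contained sketch (hypothetical table and data, using Python's stdlib `sqlite3`):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, email TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice@example.com')")
conn.execute("INSERT INTO users VALUES (2, 'bob@example.com')")

def find_user_unsafe(email):
    # Tutorial-style code: works for normal input, but the value is
    # spliced into the SQL string, so crafted input rewrites the query
    return conn.execute(
        f"SELECT id, email FROM users WHERE email = '{email}'"
    ).fetchall()

def find_user_safe(email):
    # Parameterized query: the driver treats the value as data, not SQL
    return conn.execute(
        "SELECT id, email FROM users WHERE email = ?", (email,)
    ).fetchall()

print(find_user_unsafe("' OR '1'='1"))  # leaks every row
print(find_user_safe("' OR '1'='1"))    # returns []
```

Both functions pass a naive "does it work" test with a normal email address, which is exactly why the unsafe version survives in training data.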
Measuring Security Debt in AI-Generated Apps
Security debt is the gap between your current security posture and the level required for your threat model. AI-generated apps typically accumulate significant debt quickly because the generation process has no security feedback loop.
A typical finding profile
Running SAST + SCA + secrets scanning on a medium-complexity AI-generated app (auth, API, database, frontend) typically surfaces:
- SAST findings: 15–40 issues: SQL injection, XSS, SSRF, insecure deserialization, path traversal
- SCA findings: 20–80 vulnerable dependencies, typically 3–8 critical severity
- Secrets findings: 2–10 hardcoded credentials or tokens
- IaC findings: 10–30 misconfigurations if infrastructure code was generated
At a conservative estimate of 2 hours per critical finding to triage and fix, a 5-critical-finding app represents 10 hours of security remediation on top of the 2-hour build time. The security cost can exceed the build cost by 5× or more.
The invisible debt problem: Most developers running AI-generated apps have never scanned them. They do not know the debt exists until something is breached or a compliance audit surfaces it.
Auth and Session Mistakes
Authentication and session management are areas where AI-generated code is particularly risky. The reason: auth is complex, the patterns vary significantly across frameworks, and the "easy" pattern that appears in tutorials is often subtly wrong.
Common AI auth mistakes
- Weak JWT secrets: Using short, guessable strings like `"secret"` or `"change-me"` as signing keys
- Algorithm confusion: Accepting `alg: none` in JWT validation, allowing unsigned tokens
- No token expiry: Generating tokens without `exp` claims, so tokens never expire
- Password storage: Occasionally storing plaintext passwords or using weak hashing (MD5, SHA-1)
- Session fixation: Not regenerating session IDs after authentication
```javascript
// AI-generated: multiple issues
const token = jwt.sign(
  { userId: user.id },
  "secret",              // hardcoded weak secret
  { algorithm: "HS256" } // no expiry
);
```

```javascript
// Correct
const token = jwt.sign(
  { userId: user.id, iat: Math.floor(Date.now() / 1000) },
  process.env.JWT_SECRET, // from env, min 32 bytes
  { algorithm: "HS256", expiresIn: "15m" }
);
```
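The algorithm-confusion mistake above can be blocked by pinning the algorithms you accept before any verification happens. A stdlib-only sketch of the header check (in a real app, pass an `algorithms` allowlist to a maintained JWT library rather than hand-rolling this):

```python
import base64
import json

ALLOWED_ALGS = {"HS256"}  # pin exactly what the server signs with

def jwt_header(token: str) -> dict:
    """Decode the (unverified) JWT header segment."""
    seg = token.split(".")[0]
    seg += "=" * (-len(seg) % 4)  # restore base64url padding
    return json.loads(base64.urlsafe_b64decode(seg))

def check_algorithm(token: str) -> None:
    alg = jwt_header(token).get("alg")
    if alg not in ALLOWED_ALGS:
        # Rejects alg: none and any downgrade/confusion attempt
        raise ValueError(f"disallowed JWT algorithm: {alg}")

# An unsigned alg: none token is rejected before any signature check
header = base64.urlsafe_b64encode(b'{"alg":"none"}').rstrip(b"=").decode()
try:
    check_algorithm(f"{header}.e30.")
except ValueError as e:
    print(e)
```

The important property is that the allowlist lives on the server and never comes from the token itself.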
Data Exposure Patterns
AI tends to return entire database objects from API endpoints rather than projecting only the fields the client needs. This is the most common source of data exposure in AI-generated apps.
Over-exposed API responses
When AI generates a user profile endpoint, it typically returns the full user record, including password hash, internal flags, admin status, and other fields the frontend never uses. Any of these extra fields could be exploited by an attacker.
```python
# AI-generated: returns the full object
@app.get("/users/me")
def get_me(user=Depends(get_current_user)):
    return user  # includes hashed_password, is_admin, etc.
```

```python
# Correct: explicit response model
class UserPublic(BaseModel):
    id: int
    email: str
    display_name: str

@app.get("/users/me", response_model=UserPublic)
def get_me(user=Depends(get_current_user)):
    return user
```
Dependency Risk
AI selects dependencies based on training data popularity β which means it picks packages that were popular when the training data was collected. Many of these are now unmaintained or have known CVEs.
The outdated package problem
In a typical AI-generated Node.js app, 20–30% of direct dependencies will have at least one known vulnerability. This is not unique to AI (human developers face the same problem) but AI compounds it because it never prompts you to run `npm audit`.
- AI picks `express 4.x` when `express 5.x` has been stable for over a year
- AI recommends `jsonwebtoken 8.x`, a version with known algorithm-confusion vulnerabilities
- AI selects deprecated image processing libraries that have path traversal CVEs
SCA matters here: Software Composition Analysis (SCA) will immediately surface these. It takes about 30 seconds to run and is the single highest-ROI security check for AI-generated apps.
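A toy illustration of what SCA automates. The minimum-version table here is hypothetical and hardcoded for the sketch; real SCA tools resolve the full dependency tree and consult CVE databases:

```python
import json

# Hypothetical minimum safe majors; a real SCA tool queries CVE feeds
MIN_SAFE_MAJOR = {"express": 5, "jsonwebtoken": 9}

def flag_outdated(package_json: str) -> list[str]:
    """Flag direct dependencies pinned below a known-safe major version."""
    deps = json.loads(package_json).get("dependencies", {})
    flagged = []
    for name, spec in deps.items():
        # Crude major-version extraction from specs like "^4.18.2" or "8.5.1"
        major = int(spec.lstrip("^~").split(".")[0])
        if name in MIN_SAFE_MAJOR and major < MIN_SAFE_MAJOR[name]:
            flagged.append(f"{name}@{spec} (want >= {MIN_SAFE_MAJOR[name]}.x)")
    return flagged

pkg = '{"dependencies": {"express": "^4.18.2", "jsonwebtoken": "8.5.1"}}'
print(flag_outdated(pkg))
```

In practice you never write this yourself: `npm audit` or `pip-audit` does the same comparison against a live vulnerability database.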
When AI Genuinely Helps Security
AI is not uniformly bad at security. There are areas where it genuinely improves the security posture of AI-generated apps compared to rushed human code:
- Password hashing: AI almost always uses bcrypt or argon2 correctly when generating user registration flows
- HTTPS by default: AI-generated server configurations typically enforce TLS
- Input validation presence: AI usually adds some level of input validation β even if it is not complete
- CSRF tokens: Modern AI assistants aware of framework conventions will include CSRF protection in form-handling code
- Error message hygiene: AI often avoids returning raw stack traces in production error handlers
The pattern is that AI handles the well-documented, widely-discussed security patterns reasonably well. It fails on the subtle, context-dependent ones (BOLA, business logic flaws, insecure direct object references) that require understanding the application's specific threat model.
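BOLA is a one-line omission that no scanner can reliably infer, because only you know which user owns which record. A minimal sketch of the ownership check, using a hypothetical in-memory store in place of a database:

```python
# Hypothetical in-memory store standing in for a documents table
DOCUMENTS = {
    1: {"owner_id": 10, "body": "alice's notes"},
    2: {"owner_id": 20, "body": "bob's notes"},
}

class Forbidden(Exception):
    pass

def get_document(doc_id: int, current_user_id: int) -> dict:
    doc = DOCUMENTS[doc_id]
    # The BOLA fix: the record must belong to the requester.
    # AI-generated handlers typically fetch by id and skip this line.
    if doc["owner_id"] != current_user_id:
        raise Forbidden(f"user {current_user_id} does not own document {doc_id}")
    return doc

print(get_document(1, current_user_id=10)["body"])  # owner: allowed
try:
    get_document(1, current_user_id=20)  # another user: rejected
except Forbidden as e:
    print(e)
```

The handler without the ownership check is indistinguishable from a correct one to SAST, which is why this belongs in the manual-review step of any workflow.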
A Secure AI App Workflow
You can use AI to build complete apps and still keep security debt manageable. The key is integrating security checks immediately rather than saving them for "later" (which never comes).
Minimum viable secure process
- Generate the app: use AI to scaffold the full codebase
- Run SCA immediately: `npm audit` / `pip-audit` before writing a single line of custom code. Fix critical CVEs in generated dependencies now.
- Secrets scan before first commit: catch any hardcoded credentials before they enter git history
- SAST on the full codebase: run a static analysis pass to find injection points, auth issues, and data exposure patterns
- Review auth and ownership logic manually: AI auth mistakes cannot be caught by SAST alone. Read the auth flow with fresh eyes.
- Set up CI gates: configure SAST and SCA to block merges on high/critical findings from this point forward
```yaml
name: Security Gates
on: [push, pull_request]
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: SCA - Dependency Audit
        run: npm audit --audit-level=high
      - name: Secrets Scan
        uses: aquilax/scan-action@v1
        with:
          scan-type: secrets,sast
          fail-on: high
```
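The secrets-scan-before-first-commit step can also be enforced locally, so credentials never reach git history at all. A sketch of a pre-commit hook configuration, assuming the gitleaks pre-commit integration (the pinned `rev` is illustrative; pin whichever release you verify):

```yaml
# .pre-commit-config.yaml: scans staged changes for secrets on every commit
repos:
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.4
    hooks:
      - id: gitleaks
```

Local hooks catch the mistake seconds after it happens; the CI gate above remains the authoritative backstop.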
This is not a complete security programme, but it catches the majority of AI-generated vulnerability classes before they reach production. The investment is roughly one hour of setup for potentially dozens of hours of breach response saved.
Scan Your AI-Generated App
AquilaX runs SAST, SCA, and secrets scanning across your entire codebase in minutes. See the real security cost of your AI app.
Start Free Scan