Natural language over security data.

Security teams shouldn't need a SQL expert to query their scan results. The Query model (internally named NL-PGSQL) lets anyone on the team ask questions about findings, trends, and exposure in natural language, and receive a precise PostgreSQL query they can run against the AquilaX data store.

This powers the analytics layer of Securitron: when engineers ask questions in the chat interface about scan trends or repository risk rankings, the QnA model hands off data retrieval tasks to NL-PGSQL, which generates the appropriate query and returns structured results.

Model ID: AquilaX-AI/NL-PGSQL, available on HuggingFace. Base architecture: google/flan-t5-base. Task: text-to-SQL (natural language → PostgreSQL). Trained on AquilaX's security data schema.

Architecture.

NL-PGSQL is fine-tuned from google/flan-t5-base, a 250M-parameter encoder-decoder model from Google pre-trained on 1,800+ NLP tasks via instruction tuning. FLAN-T5's instruction-following capability makes it well-suited for constrained text generation tasks like SQL synthesis, where the output must follow strict grammatical rules.

Training details:

  • Base model: google/flan-t5-base (250M parameters, encoder-decoder)
  • Task: Seq2Seq (natural language question → PostgreSQL statement)
  • Training split: 90% train / 10% validation
  • Evaluation metric: SacreBLEU, which measures token-level overlap between generated and reference SQL
  • Training data: Paired NL/SQL examples drawn from AquilaX's security schema (findings, scans, repositories, organisations, severity distributions)

5-step inference pipeline.

Every natural language query goes through a standardised 5-step pipeline before the SQL statement is returned:

1. Preprocess
   Normalise the input: strip extra whitespace, lower-case, and resolve abbreviations (e.g. "crit" → "critical", "repo" → "repository").

2. Add task prefix
   Prepend the model's task instruction: "Translate the following text to PGSQL: ". This prefix activates the fine-tuned SQL generation behaviour.

3. Tokenise
   Encode the prefixed input with the FLAN-T5 tokeniser. The maximum input length is 512 tokens; longer inputs are truncated with a warning.

4. Generate
   Run encoder-decoder inference with beam search (beam width 4). The decoder generates SQL tokens one at a time until an EOS token or the maximum length is reached.

5. Decode
   Convert output token IDs back to text with skip_special_tokens=True, then post-process to ensure valid SQL syntax: validate brackets, aliases, and table name references.
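Steps 1 and 5 are plain string handling around the model call. The sketch below illustrates both with stdlib Python; the abbreviation map and the bracket check are illustrative assumptions, not the actual AquilaX post-processing code, which also validates aliases and table references.

```python
import re

# Hypothetical abbreviation map -- the real mapping used in the pipeline
# is not published; these entries illustrate step 1.
ABBREVIATIONS = {"crit": "critical", "repo": "repository", "repos": "repositories"}

def preprocess(question: str) -> str:
    """Step 1: normalise whitespace, lower-case, expand abbreviations."""
    text = re.sub(r"\s+", " ", question).strip().lower()
    return " ".join(ABBREVIATIONS.get(w, w) for w in text.split(" "))

def brackets_balanced(sql: str) -> bool:
    """Part of step 5: a minimal check that parentheses pair up."""
    depth = 0
    for ch in sql:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0

print(preprocess("Show  crit findings per Repo"))
# -> show critical findings per repository
print(brackets_balanced("SELECT COUNT(*) FROM findings"))
# -> True
```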

Query examples.

Natural language:
  Show me all critical findings from the last 7 days, ordered by repository
Generated SQL:
  SELECT f.*, r.name AS repo_name FROM findings f JOIN repositories r ON f.repo_id = r.id WHERE f.severity = 'CRITICAL' AND f.created_at >= NOW() - INTERVAL '7 days' ORDER BY r.name ASC;

Natural language:
  Which repositories have the most unresolved high severity findings?
Generated SQL:
  SELECT r.name, COUNT(f.id) AS finding_count FROM findings f JOIN repositories r ON f.repo_id = r.id WHERE f.severity = 'HIGH' AND f.status != 'resolved' GROUP BY r.name ORDER BY finding_count DESC LIMIT 10;

Natural language:
  What is the average security score across all repositories in my organisation?
Generated SQL:
  SELECT AVG(security_score) AS avg_score FROM repositories WHERE org_id = $1 AND last_scanned_at IS NOT NULL;

Natural language:
  Count findings by CWE type for the past month
Generated SQL:
  SELECT cwe_id, cwe_name, COUNT(*) AS count FROM findings WHERE created_at >= NOW() - INTERVAL '30 days' GROUP BY cwe_id, cwe_name ORDER BY count DESC;
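Pairs like the ones above, combined with the task prefix from step 2, are the shape of the seq2seq training data. A minimal sketch of how such a record might be assembled is shown below; the field names are assumptions for illustration, as the actual dataset layout is not published.

```python
import json

TASK_PREFIX = "Translate the following text to PGSQL: "

def to_seq2seq_record(question: str, sql: str) -> dict:
    # Hypothetical record layout for a paired NL/SQL training example.
    return {"input_text": TASK_PREFIX + question, "target_text": sql}

pair = to_seq2seq_record(
    "Count findings by CWE type for the past month",
    "SELECT cwe_id, cwe_name, COUNT(*) AS count FROM findings "
    "WHERE created_at >= NOW() - INTERVAL '30 days' "
    "GROUP BY cwe_id, cwe_name ORDER BY count DESC;",
)
print(json.dumps(pair, indent=2))
```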

Running inference.

import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "AquilaX-AI/NL-PGSQL"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
model.eval()

def nl_to_sql(question: str) -> str:
    # Steps 1-2: Preprocess and add the task prefix
    prefixed = f"Translate the following text to PGSQL: {question.strip()}"

    # Step 3: Tokenise (inputs beyond 512 tokens are truncated)
    inputs = tokenizer(
        prefixed,
        return_tensors="pt",
        max_length=512,
        truncation=True
    )

    # Step 4: Generate with beam search (no gradients needed at inference)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=256,
            num_beams=4,
            early_stopping=True
        )

    # Step 5: Decode, skipping special tokens
    sql = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return sql

# Example usage
question = "Which repositories have critical SQL injection findings?"
sql_query = nl_to_sql(question)
print(sql_query)
# Output: SELECT r.name FROM repositories r JOIN findings f ON f.repo_id = r.id
#         WHERE f.cwe_id = 89 AND f.severity = 'CRITICAL';

Evaluation and limitations.

The model is evaluated on a held-out 10% validation set using SacreBLEU, which measures how closely the generated SQL matches reference queries at the token level. A SacreBLEU score above 40 is generally considered good for text-to-SQL tasks.
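To make the token-overlap idea concrete, the toy function below computes a single modified n-gram precision over whitespace-tokenised SQL. This is only the core ingredient of BLEU-style scoring, not real SacreBLEU, which averages 1- to 4-gram precisions and applies smoothing and a brevity penalty.

```python
from collections import Counter

def ngram_precision(candidate: str, reference: str, n: int = 2) -> float:
    """Toy modified n-gram precision on whitespace tokens.

    Illustrates the token-overlap idea behind scoring generated SQL
    against a reference; this is NOT the actual SacreBLEU metric.
    """
    cand = candidate.split()
    ref = reference.split()
    cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    if not cand_ngrams:
        return 0.0
    # Clip each candidate n-gram count by its count in the reference
    overlap = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
    return overlap / sum(cand_ngrams.values())

gen = "SELECT name FROM repositories ORDER BY name ASC ;"
ref = "SELECT name FROM repositories ORDER BY name DESC ;"
print(ngram_precision(gen, ref, n=2))
# -> 0.75  (6 of 8 bigrams match; only ASC vs DESC differs)
```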

Scope limitation: NL-PGSQL is trained specifically on AquilaX's security data schema. It produces reliable queries for findings, repositories, scans, organisations, and severity data, but it should not be used as a general-purpose NL-to-SQL translator for arbitrary database schemas.

Generated queries should be reviewed before execution in production environments. The AquilaX platform runs all generated queries through a validation layer that checks for syntax errors, verifies table and column references, and enforces row-level security based on the requesting user's organisation scope.
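The platform's validation layer is internal, but its first line of defence might look like the minimal sketch below: accept only a single read-only SELECT over allowlisted tables. The table allowlist and regexes are illustrative assumptions; the real layer also verifies column references and enforces row-level security, neither of which is shown here.

```python
import re

# Hypothetical allowlist for illustration only.
ALLOWED_TABLES = {"findings", "repositories", "scans", "organisations"}
FORBIDDEN = re.compile(r"\b(INSERT|UPDATE|DELETE|DROP|ALTER|TRUNCATE|GRANT)\b", re.I)

def is_safe_select(sql: str) -> bool:
    """Reject anything that is not a single read-only SELECT over known tables."""
    stripped = sql.strip().rstrip(";")
    if ";" in stripped:  # more than one statement
        return False
    if not stripped.upper().startswith("SELECT"):
        return False
    if FORBIDDEN.search(stripped):
        return False
    tables = re.findall(r"\b(?:FROM|JOIN)\s+([A-Za-z_]+)", stripped, re.I)
    return all(t.lower() in ALLOWED_TABLES for t in tables)

print(is_safe_select("SELECT * FROM findings WHERE severity = 'HIGH';"))
# -> True
print(is_safe_select("SELECT 1; DROP TABLE findings;"))
# -> False
```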

"The goal was never to replace SQL developers. It was to let a Head of Engineering ask 'how many critical findings were introduced this sprint?' without filing a ticket to the data team."