Input Sanitisation - Injection Prevention Patterns

Defence-in-depth patterns preventing injection attacks across the full stack. Complements data-validation (which checks shape/type) by ensuring data is safe for its destination context (HTML, SQL, shell, URL).

Description

Covers XSS, SQL injection, command injection, URL redirect, and SSRF prevention patterns for the Next.js frontend and FastAPI backend. Enforces output encoding, parameterised queries, and safe subprocess handling aligned with OWASP Top 10 guidelines.

When to Apply

Positive Triggers

Rendering user-generated content in HTML
Constructing database queries with user input
Building shell commands or subprocess calls
Handling URL parameters or redirect targets
Reviewing code for OWASP Top 10 vulnerabilities
User mentions: "XSS", "injection", "sanitise", "security", "escape", "OWASP"

Negative Triggers

Validating data shape or type (use data-validation instead)
Classifying error responses (use error-taxonomy instead)
Configuring authentication or RBAC (use auth patterns directly)
Setting up CORS or rate limiting (already handled in middleware)

Core Principle

Validation checks what data IS. Sanitisation ensures data is SAFE for its destination.

User Input ──► Validate ──► Sanitise ──► Use in Context
               (shape)      (safety)     (HTML/SQL/shell/URL)
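As a rough illustration of "safe for its destination", the sketch below encodes the same already-validated string for three contexts using only the Python standard library. The function name `sanitise_for` and the context labels are hypothetical and exist only for this example; they are not part of the project.

```python
import html
import shlex
from urllib.parse import quote


def sanitise_for(context: str, value: str) -> str:
    """Encode an already-validated string for one destination context."""
    if context == "html":
        # Escapes &, <, > and both quote characters
        return html.escape(value, quote=True)
    if context == "shell":
        # Wraps the value so a POSIX shell treats it as a single word
        return shlex.quote(value)
    if context == "url":
        # Percent-encodes everything outside the unreserved set
        return quote(value, safe="")
    raise ValueError(f"unknown context: {context}")


payload = "<script>alert(1)</script>"
print(sanitise_for("html", payload))   # &lt;script&gt;alert(1)&lt;/script&gt;
print(sanitise_for("shell", payload))  # '<script>alert(1)</script>'
print(sanitise_for("url", payload))    # %3Cscript%3Ealert%281%29%3C%2Fscript%3E
```

SQL is deliberately absent from this sketch: for SQL the safe approach is bound parameters rather than string encoding.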

Attack Vector 1: Cross-Site Scripting (XSS)

The Threat

Untrusted data rendered as HTML can execute arbitrary JavaScript in the user's browser.

React's Built-In Protection

React escapes all JSX expressions by default. This is safe:

    // SAFE: React auto-escapes this
    function UserComment({ text }: { text: string }) {
      return <p>{text}</p>;
    }

    // Input:  <script>alert('XSS')</script>
    // Output: &lt;script&gt;alert('XSS')&lt;/script&gt; (displayed as literal text, never executed)

Dangerous Exceptions

These patterns bypass React's escaping and require manual sanitisation:

    // DANGEROUS: dangerouslySetInnerHTML
    // Only use with sanitised content
    import DOMPurify from 'dompurify';

    function RichContent({ html }: { html: string }) {
      const clean = DOMPurify.sanitize(html, {
        ALLOWED_TAGS: ['b', 'i', 'em', 'strong', 'a', 'p', 'br', 'ul', 'li'],
        ALLOWED_ATTR: ['href', 'target', 'rel'],
      });
      return <div dangerouslySetInnerHTML={{ __html: clean }} />;
    }

    // DANGEROUS: href with user input (javascript: protocol)
    function UserLink({ url }: { url: string }) {
      // Validate protocol before rendering
      const safeUrl = /^https?:\/\//i.test(url) ? url : '#';
      return <a href={safeUrl}>{url}</a>;
    }

Content Security Policy (CSP)

Add CSP headers in next.config.ts to prevent inline script execution:

    // next.config.ts
    const securityHeaders = [
      {
        key: 'Content-Security-Policy',
        value: [
          "default-src 'self'",
          "script-src 'self' 'nonce-{random}'",
          "style-src 'self' 'unsafe-inline'", // Required for Tailwind
          "img-src 'self' data: https:",
          "connect-src 'self' http://localhost:8000",
          "frame-ancestors 'none'",
        ].join('; '),
      },
    ];
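The `{random}` placeholder in `script-src` only protects anything if it is replaced with a fresh, unpredictable value on every response. The sketch below shows that substitution logic in Python, purely to illustrate the idea; `new_nonce` and `build_csp` are hypothetical names, and in the Next.js app the equivalent logic would need to run per request rather than in a static config value.

```python
import secrets


def new_nonce() -> str:
    """Return an unpredictable nonce; generate a new one for each response."""
    # 16 random bytes, URL-safe base64 encoded
    return secrets.token_urlsafe(16)


def build_csp(nonce: str) -> str:
    """Join the CSP directives, embedding the nonce in script-src."""
    directives = [
        "default-src 'self'",
        f"script-src 'self' 'nonce-{nonce}'",
        "style-src 'self' 'unsafe-inline'",
        "img-src 'self' data: https:",
        "connect-src 'self' http://localhost:8000",
        "frame-ancestors 'none'",
    ]
    return "; ".join(directives)


nonce = new_nonce()
header_value = build_csp(nonce)
print(header_value)
```

The same nonce must also be set on each inline `<script nonce="...">` tag in that response, otherwise the browser blocks the script.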

Attack Vector 2: SQL Injection

The Threat

Untrusted data interpolated into SQL queries can read, modify, or delete database content.

SQLAlchemy Protection (Already in Use)

The project uses the SQLAlchemy ORM, which parameterises queries by default:

    from sqlalchemy import select, text

    # SAFE: SQLAlchemy ORM (parameterised automatically)
    result = await session.execute(
        select(User).where(User.email == user_email)
    )

    # SAFE: SQLAlchemy text() with bound parameters
    result = await session.execute(
        text("SELECT * FROM users WHERE email = :email"),
        {"email": user_email}
    )

Dangerous Patterns (NEVER USE)

    # DANGEROUS: f-string interpolation in SQL
    query = f"SELECT * FROM users WHERE email = '{user_email}'"

    # DANGEROUS: .format() in SQL
    query = "SELECT * FROM users WHERE email = '{}'".format(user_email)

    # DANGEROUS: % formatting in SQL
    query = "SELECT * FROM users WHERE email = '%s'" % user_email
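To make the difference concrete, here is a minimal, self-contained demonstration. It uses the standard-library `sqlite3` driver instead of SQLAlchemy so that it runs without extra dependencies; the table, rows, and malicious input are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?)",
    [("a@example.com",), ("b@example.com",)],
)

malicious = "nobody@example.com' OR '1'='1"

# DANGEROUS: the quote inside the input closes the string literal early,
# so the OR clause becomes part of the query and matches every row
unsafe_sql = f"SELECT email FROM users WHERE email = '{malicious}'"
leaked = conn.execute(unsafe_sql).fetchall()

# SAFE: the value travels separately from the SQL text and is only ever
# compared as data
safe = conn.execute(
    "SELECT email FROM users WHERE email = ?", (malicious,)
).fetchall()

print(len(leaked))  # 2
print(len(safe))    # 0
```

The interpolated query returns every user, while the bound-parameter query returns nothing because no email equals the malicious string.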

Detection Rule

Grep for SQL injection risks:

    rg "f['\"].*SELECT|SELECT.*\.format\(|SELECT.*['\"]\s*%" apps/backend/src/

Any matches are Critical security findings.

Attack Vector 3: Command Injection

The Threat

Untrusted data in shell commands can execute arbitrary system commands.

Safe Subprocess Calls

    import subprocess
    import shlex

    # SAFE: List form (no shell interpretation)
    subprocess.run(
        ["git", "log", "--oneline", "-n", str(count)],
        capture_output=True,
        text=True
    )

    # SAFE: shlex.quote for unavoidable string commands
    filename = shlex.quote(user_filename)
    subprocess.run(f"wc -l {filename}", shell=True)
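The following sketch shows what each safe form does with a hostile filename. The filename is invented, and the child process is the current Python interpreter (rather than `git` or `wc`) so the example runs anywhere Python does.

```python
import shlex
import subprocess
import sys

user_filename = "notes.txt; echo INJECTED"

# shlex.quote wraps the value so a POSIX shell sees one literal word
quoted = shlex.quote(user_filename)
print(quoted)  # 'notes.txt; echo INJECTED'

# Splitting the command the way a POSIX shell would keeps it as one argument
print(shlex.split(f"wc -l {quoted}"))  # ['wc', '-l', 'notes.txt; echo INJECTED']

# List form: no shell is involved, so the value arrives as a single argv entry
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", user_filename],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())  # notes.txt; echo INJECTED
```

Note that `shlex.quote` targets POSIX shells only, which is one more reason to prefer the list form wherever possible.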

Dangerous Patterns (NEVER USE)

    # DANGEROUS: Unquoted user input in shell
    subprocess.run(f"cat {user_input}", shell=True)

    # DANGEROUS: os.system with user input
    import os
    os.system(f"rm {filename}")

Detection Rule

    rg "os\.system\(|shell=True" apps/backend/src/
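A text search also matches comments and string literals. As a complementary sketch, the same two risks can be found by walking the Python syntax tree, which only reports real call sites. `find_shell_risks` is a hypothetical helper written for this example, and it only recognises the literal spellings `os.system(...)` and `shell=True`.

```python
import ast


def find_shell_risks(source: str) -> list[tuple[int, str]]:
    """Return (line, reason) pairs for os.system calls and shell=True keywords."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Matches calls written exactly as os.system(...)
        if (
            isinstance(func, ast.Attribute)
            and func.attr == "system"
            and isinstance(func.value, ast.Name)
            and func.value.id == "os"
        ):
            findings.append((node.lineno, "os.system call"))
        # Matches any call that passes the literal keyword shell=True
        for kw in node.keywords:
            if (
                kw.arg == "shell"
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True
            ):
                findings.append((node.lineno, "shell=True"))
    return sorted(findings)


sample = '''import os, subprocess
os.system("ls")
subprocess.run(["ls"], shell=False)
subprocess.run("ls " + name, shell=True)
# shell=True in a comment is ignored
'''
print(find_shell_risks(sample))  # [(2, 'os.system call'), (4, 'shell=True')]
```

Aliased imports (for example `from os import system`) would slip past this sketch, so it supplements the grep rule rather than replacing it.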

Attack Vector 4: URL/Redirect Injection

The Threat

Open redirects allow attackers to redirect users to malicious sites after login.

Safe Redirect Pattern

    // Whitelist allowed redirect destinations
    const ALLOWED_HOSTS = ['localhost:3000', 'yourdomain.com.au'];

Backend Redirect Validation

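    from urllib.parse import urlparse

    ALLOWED_HOSTS = {"localhost", "yourdomain.com.au"}

    def validate_redirect(url: str, default: str = "/") -> str:
        """Validate redirect URL against whitelist."""
        try:
            parsed = urlparse(url)
        except ValueError:
            return default
        if parsed.hostname:
            # Absolute URL: require http(s) and a whitelisted host
            if parsed.scheme in ("http", "https") and parsed.hostname in ALLOWED_HOSTS:
                return url
            return default
        # No host: accept only same-site paths such as "/dashboard".
        # This rejects "javascript:..." and "https:evil.com", which have no hostname.
        if not parsed.scheme and url.startswith("/") and not url.startswith(("//", "/\\")):
            return url
        return default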

Attack Vector 5: Server-Side Request Forgery (SSRF)

The Threat

User-controlled URLs in server-side requests can access internal services.

Safe URL Fetching

    import ipaddress
    from urllib.parse import urlparse

    BLOCKED_RANGES = [
        ipaddress.ip_network("10.0.0.0/8"),
        ipaddress.ip_network("172.16.0.0/12"),
        ipaddress.ip_network("192.168.0.0/16"),
        ipaddress.ip_network("127.0.0.0/8"),
        ipaddress.ip_network("169.254.0.0/16"),
        ipaddress.ip_network("::1/128"),  # IPv6 loopback
    ]

    def is_safe_url(url: str) -> bool:
        """Reject URLs pointing to internal/private networks."""
        parsed = urlparse(url)
        # A missing hostname (e.g. "http:///path") is rejected outright
        if parsed.scheme not in ("http", "https") or not parsed.hostname:
            return False
        try:
            ip = ipaddress.ip_address(parsed.hostname)
            return not any(ip in net for net in BLOCKED_RANGES)
        except ValueError:
            # Hostname, not an IP literal: allowed here, so the resolved
            # address must be checked again when DNS resolution happens later
            return True
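Because the function above lets hostnames through unresolved, a name that resolves to an internal address would still pass. One possible way to close that gap is to resolve the name first and check every returned address, as sketched below. The names `resolve_host`, `is_public_ip`, and `is_safe_url_resolved` are hypothetical, and the `resolver` parameter exists only so the check can be exercised without network access.

```python
import ipaddress
import socket
from typing import Callable, Iterable
from urllib.parse import urlparse


def resolve_host(hostname: str) -> list[str]:
    """Return every IP address the hostname currently resolves to."""
    return [info[4][0] for info in socket.getaddrinfo(hostname, None)]


def is_public_ip(address: str) -> bool:
    """True only for addresses outside private, loopback and similar ranges."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        return False  # Unparseable address: treat as unsafe
    return not (
        ip.is_private or ip.is_loopback or ip.is_link_local
        or ip.is_reserved or ip.is_multicast or ip.is_unspecified
    )


def is_safe_url_resolved(
    url: str,
    resolver: Callable[[str], Iterable[str]] = resolve_host,
) -> bool:
    """Reject the URL unless every resolved address is public."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        addresses = list(resolver(parsed.hostname))
    except OSError:
        return False  # Resolution failed: nothing safe to connect to
    return bool(addresses) and all(is_public_ip(a) for a in addresses)


# Usage with a stub resolver standing in for DNS
print(is_safe_url_resolved("http://internal.example/", lambda h: ["10.0.0.5"]))      # False
print(is_safe_url_resolved("https://example.com/", lambda h: ["93.184.216.34"]))    # True
```

A resolve-then-fetch check can still be raced if the name resolves differently at request time, so the actual request should ideally connect to the address that was vetted.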

Sanitisation Checklist

When reviewing code for injection risks:

No dangerouslySetInnerHTML without DOMPurify
No javascript: or data: URLs in href/src attributes
No f-string/format SQL — all queries use ORM or bound parameters
No shell=True with user input — use list form or shlex.quote
No open redirects — all redirect targets validated against whitelist
No user-controlled URLs in server-side fetch/requests without SSRF check
CSP headers configured in Next.js

Anti-Patterns

| Pattern | Problem | Correct Approach |
|---|---|---|
| String concatenation in SQL (f"SELECT ... WHERE id = '{user_id}'") | SQL injection vulnerability | Use SQLAlchemy ORM or bound parameters with text() |
| innerHTML or dangerouslySetInnerHTML for user content without sanitisation | XSS attack vector | Sanitise with DOMPurify before rendering |
| No output encoding for context (HTML, URL, shell) | Injection in the destination context | Encode output appropriate to context: HTML-escape, URL-encode, shlex.quote |
| Trusting client-side validation as the only defence | Attackers bypass the browser entirely | Always re-validate and sanitise on the server side |
| shell=True with user-controlled arguments | Command injection vulnerability | Use list-form subprocess calls or shlex.quote |

Checklist

All SQL queries use parameterised queries (ORM or bound parameters)
XSS output encoding applied (DOMPurify for rich HTML, React auto-escaping elsewhere)
Command injection prevention verified (no shell=True with user input)
OWASP Top 10 input handling coverage reviewed
Redirect URLs validated against an allow-list
CSP headers configured in next.config.ts

Response Format

[AGENT_ACTIVATED]: Input Sanitisation
[PHASE]: {Audit | Implementation | Review}
[STATUS]: {in_progress | complete}

{security analysis or implementation guidance}

[NEXT_ACTION]: {what to do next}

Integration Points

Data Validation

data-validation runs first (checks shape), then input-sanitisation ensures safety:

Input → data-validation (is it valid?) → input-sanitisation (is it safe?) → Use

Error Taxonomy

Sanitisation failures should use AUTH_PERMISSION_* or DATA_VALIDATION_* error codes. Never reveal internal details in error messages to untrusted clients.

Council of Logic (Turing Check)

Sanitisation functions must run in O(n) time: avoid nested quantifiers and other backtracking-prone regex patterns that could cause ReDoS (Regular Expression Denial of Service).
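As an illustration of the kind of pattern to avoid, the sketch below contrasts a regex with nested quantifiers against an unambiguous rewrite that accepts the same strings, plus a length cap as a second guard. The names `RISKY`, `SAFE`, `MAX_LEN`, and `is_simple_text` and the cap of 256 are assumptions made for this example.

```python
import re

MAX_LEN = 256  # Hypothetical cap; bounds the work done on any single input

# Backtracking-prone: the nested quantifiers let the engine split a run of
# word characters in exponentially many ways before a match finally fails
# (for example on "aaaa...a!"). Shown for contrast only; never executed here.
RISKY = re.compile(r"^(\w+\s?)+$")

# Same accepted strings, but each character can be consumed in only one way:
# words separated by single whitespace, with one optional trailing whitespace
SAFE = re.compile(r"\w+(?:\s\w+)*\s?")


def is_simple_text(value: str) -> bool:
    """Accept short runs of words separated by single whitespace characters."""
    if len(value) > MAX_LEN:
        return False
    return SAFE.fullmatch(value) is not None


print(is_simple_text("hello world"))    # True
print(is_simple_text("a" * 200 + "!"))  # False, and it returns promptly
```

Checking the length before the regex keeps the cost bounded even if a pattern later turns out to be slower than expected.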

Australian Localisation (en-AU)

Spelling: sanitisation, authorisation, defence, analyse, centre, colour
Compliance: Privacy Act 1988, Australian Cyber Security Centre (ACSC) guidelines
Tone: Direct, security-conscious — state risks clearly

Related Skills

genesis-orchestrator

council-of-logic

xaem-theme-ui