owasp llm security

🧠 OWASP LLM Security Skill

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "owasp llm security" with this command: npx skills add hack23/homepage/hack23-homepage-owasp-llm-security

🧠 OWASP LLM Security Skill

🎯 Purpose

Enforce LLM-specific security controls aligned with OWASP LLM Top 10 2025.

Key Principle: "LLMs introduce unique security risks requiring specialized controls."

πŸ“š Scope

  • πŸ”“ Prompt Injection Prevention

  • πŸ’Ύ Insecure Output Handling

  • πŸ‹οΈ Model Denial of Service

  • οΏ½οΏ½ Supply Chain Vulnerabilities

  • πŸ”Œ Insecure Plugin Design

  • πŸ“Š Excessive Agency

  • πŸ›‘οΈ Data Leakage Prevention

βš™οΈ Security Rules

MUST Requirements

owasp_llm_top_10_controls: llm01_prompt_injection: input_validation: - sanitize_user_input: remove_injection_attempts - prompt_templates: use_parameterized_prompts - context_isolation: separate_user_system_contexts

detection:
  - monitor_outputs: alert_on_suspicious_responses
  - log_inputs: track_all_prompts_for_analysis
  - anomaly_detection: flag_unusual_input_patterns

llm02_insecure_output_handling: output_validation: - sanitize_before_render: html_encode_llm_outputs - validate_format: check_expected_output_structure - xss_prevention: never_trust_llm_generated_html

secure_integration:
  - escaped_rendering: use_safe_templating_engines
  - csp_headers: content_security_policy_strict

llm03_training_data_poisoning: data_quality: - source_validation: verify_training_data_sources - adversarial_testing: test_for_backdoor_triggers - data_provenance: track_dataset_origins

monitoring:
  - model_behavior: detect_unexpected_outputs
  - regular_retraining: update_with_clean_datasets

llm04_model_denial_of_service: rate_limiting: - per_user_limits: 100_requests_per_hour - cost_caps: maximum_tokens_per_request - timeout_enforcement: 30_second_maximum_response_time

resource_management:
  - queue_management: priority_queues_for_critical_users
  - circuit_breakers: auto_disable_on_abuse_detection

llm05_supply_chain_vulnerabilities: third_party_models: - vendor_assessment: security_evaluation_before_use - model_provenance: verify_official_sources_only - sbom: software_bill_of_materials_for_ai_components

monitoring:
  - dependency_scanning: check_for_vulnerable_libraries
  - model_updates: track_security_patches_from_vendors

llm06_sensitive_info_disclosure: data_protection: - no_pii_training: never_train_on_personal_data - output_filtering: redact_potential_secrets_in_responses - context_limits: limit_context_window_to_reduce_leakage

testing:
  - red_team: attempt_to_extract_training_data
  - regression_tests: verify_no_memorization_of_secrets

llm07_insecure_plugin_design: plugin_security: - least_privilege: plugins_minimal_permissions_required - input_validation: validate_all_plugin_inputs - authentication: require_auth_for_plugin_execution

review_process:
  - security_review: all_plugins_security_audited
  - sandboxing: isolate_plugin_execution_environment

llm08_excessive_agency: authorization: - human_approval: require_approval_for_critical_actions - scope_limits: restrict_llm_to_read_only_operations - audit_trail: log_all_llm_initiated_actions

safeguards:
  - action_validation: confirm_intended_action_before_execution
  - rollback_capability: undo_mechanism_for_llm_actions

llm09_overreliance: human_oversight: - fact_checking: verify_llm_outputs_before_trust - disclaimer: inform_users_llm_may_hallucinate - critical_decisions: never_fully_automate_without_review

llm10_model_theft: access_controls: - api_authentication: require_strong_auth_for_model_access - rate_limiting: prevent_model_extraction_via_queries - watermarking: embed_watermarks_in_model_outputs

MUST NOT Prohibitions

prohibited_llm_practices:

  • unvalidated_prompts: accepting_raw_user_input_to_llm
  • trusting_outputs: using_llm_responses_without_validation
  • no_rate_limits: allowing_unlimited_api_calls
  • exposing_models: public_access_to_model_weights
  • training_on_secrets: including_api_keys_in_training_data
  • unmonitored_usage: no_logging_or_alerting_for_abuse

πŸ’‘ Examples

Example 1: Prompt Injection Prevention

Secure LLM Integration with Input Validation

import re from typing import Optional

def sanitize_user_input(user_input: str) -> str: """Remove potential prompt injection attempts""" # Remove system-like instructions injection_patterns = [ r"ignore previous instructions", r"disregard above", r"you are now", r"system:", r"admin:", ]

cleaned = user_input
for pattern in injection_patterns:
    cleaned = re.sub(pattern, "", cleaned, flags=re.IGNORECASE)

# Limit length to prevent token exhaustion
max_length = 500
cleaned = cleaned[:max_length]

return cleaned.strip()

def safe_llm_query(user_question: str) -> Optional[str]: """Safely query LLM with validated input""" # Validate and sanitize input sanitized_input = sanitize_user_input(user_question)

if not sanitized_input:
    return "Invalid input detected"

# Use parameterized prompt template
system_prompt = "You are a helpful assistant. Answer factually and concisely."
user_prompt = f"User question: {sanitized_input}"

# Query LLM with separated contexts
response = query_llm(
    system=system_prompt,
    user=user_prompt,
    max_tokens=150,
    temperature=0.7
)

# Validate output before returning
if contains_suspicious_content(response):
    log_security_event("Suspicious LLM output detected", response)
    return "Response blocked for security reasons"

return html_escape(response)  # XSS prevention

Example 2: Rate Limiting and DoS Prevention

LLM API Rate Limiting

from flask import Flask, request, jsonify from flask_limiter import Limiter from flask_limiter.util import get_remote_address import time

app = Flask(name)

Rate limiter configuration

limiter = Limiter( app=app, key_func=get_remote_address, default_limits=["100 per hour", "10 per minute"], storage_uri="redis://localhost:6379" )

Cost-based limiting

MAX_TOKENS_PER_REQUEST = 1000 MAX_COST_PER_USER_PER_DAY = 100 # USD

@app.route('/api/llm/query', methods=['POST']) @limiter.limit("10 per minute") def llm_query(): """LLM API endpoint with comprehensive DoS protection""" user_id = request.headers.get('X-User-ID') user_input = request.json.get('prompt', '')

# Token limit enforcement
estimated_tokens = len(user_input.split()) * 1.3  # Rough estimate
if estimated_tokens > MAX_TOKENS_PER_REQUEST:
    return jsonify({
        "error": "Request exceeds maximum token limit"
    }), 400

# Daily cost limit check
user_cost_today = get_user_cost_today(user_id)
if user_cost_today >= MAX_COST_PER_USER_PER_DAY:
    return jsonify({
        "error": "Daily cost limit exceeded"
    }), 429

# Timeout enforcement
start_time = time.time()
timeout_seconds = 30

try:
    response = query_llm_with_timeout(
        user_input,
        timeout=timeout_seconds
    )
    
    # Track usage cost
    cost = calculate_usage_cost(user_input, response)
    record_user_cost(user_id, cost)
    
    return jsonify({
        "response": response,
        "cost": cost,
        "remaining_daily_quota": MAX_COST_PER_USER_PER_DAY - user_cost_today
    })

except TimeoutError:
    return jsonify({
        "error": "Request timed out"
    }), 504

Example 3: Sensitive Data Leakage Prevention

Output Filtering to Prevent Data Leakage

import re from typing import List, Tuple

Patterns for sensitive data detection

SENSITIVE_PATTERNS = { 'api_key': r'[A-Za-z0-9]{32,}', 'email': r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+.[a-zA-Z]{2,}', 'ssn': r'\b\d{3}-\d{2}-\d{4}\b', 'credit_card': r'\b\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}\b', 'aws_key': r'AKIA[0-9A-Z]{16}', }

def detect_sensitive_data(text: str) -> List[Tuple[str, str]]: """Detect potential sensitive data in LLM output""" findings = []

for data_type, pattern in SENSITIVE_PATTERNS.items():
    matches = re.findall(pattern, text)
    for match in matches:
        findings.append((data_type, match))

return findings

def redact_sensitive_data(text: str) -> str: """Redact sensitive data from LLM output""" redacted = text

for data_type, pattern in SENSITIVE_PATTERNS.items():
    redacted = re.sub(pattern, f"[REDACTED_{data_type.upper()}]", redacted)

return redacted

def safe_llm_output_handler(llm_response: str) -> dict: """Validate and sanitize LLM output before display""" # Detect sensitive data sensitive_findings = detect_sensitive_data(llm_response)

if sensitive_findings:
    # Log security incident
    log_security_event(
        "Sensitive data detected in LLM output",
        findings=sensitive_findings
    )
    
    # Redact sensitive data
    safe_response = redact_sensitive_data(llm_response)
    
    return {
        "response": safe_response,
        "warning": "Some content was redacted for security",
        "redacted_count": len(sensitive_findings)
    }

# HTML escape to prevent XSS
return {
    "response": html_escape(llm_response),
    "warning": None
}

πŸ”— Integration

Policies: AI Policy

Skills: ai-governance, secure-development, data-classification

Frameworks: OWASP LLM Top 10 2025, ISO 27001 A.14

πŸ“‹ Document Control

  • Version: 1.0 | Updated: 2026-02-10

  • License: Apache-2.0

  • Classification:

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

Security

information-security-strategy

No summary provided by upstream source.

Repository SourceNeeds Review
Security

vulnerability-management

No summary provided by upstream source.

Repository SourceNeeds Review
Security

threat-modeling

No summary provided by upstream source.

Repository SourceNeeds Review