sanitize

Detect and redact personally identifiable information (PII) from text files.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install the "sanitize" skill with this command: npx skills add agentward-ai/agentward/agentward-ai-agentward-sanitize

AgentWard Sanitize

IMPORTANT — PII Safety Rules

  • Do NOT read the input file directly. It may contain sensitive PII.

  • ALWAYS use --output FILE to write sanitized output to a file.

  • Only read the OUTPUT file, never the raw input.

  • Only show the user the redacted output, never the raw input.

  • --json and --preview are safe — they do NOT print raw PII values to stdout.

  • The entity map (raw PII → placeholder mapping) is written to a separate sidecar file (*.entity-map.json) only when --output is used. Do NOT read the entity map file.
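The rules above can be sketched as a small wrapper that an agent or script might use. This is a minimal illustration, not part of the skill: the helper names build_sanitize_command and sanitize_and_read are hypothetical, and the UTF-8 output encoding is an assumption. Only the script path and the --output and --categories flags come from this document.

```python
import subprocess
import sys
from pathlib import Path


def build_sanitize_command(input_path, output_path, categories=None):
    """Build the argv for scripts/sanitize.py, always passing --output."""
    cmd = [sys.executable, "scripts/sanitize.py", str(input_path),
           "--output", str(output_path)]
    if categories:
        # The documented flag takes a comma-separated list, e.g. ssn,email.
        cmd += ["--categories", ",".join(categories)]
    return cmd


def sanitize_and_read(input_path, output_path, categories=None):
    """Run the sanitizer, then read ONLY the sanitized output file.

    The raw input and the *.entity-map.json sidecar are never opened here.
    """
    subprocess.run(
        build_sanitize_command(input_path, output_path, categories),
        check=True,                 # raise if the script exits non-zero
        stdout=subprocess.DEVNULL,  # extra caution: discard anything printed
    )
    # Assumption: the sanitized file is UTF-8 text.
    return Path(output_path).read_text(encoding="utf-8")
```

Keeping the command construction separate from the subprocess call makes it easy to check that --output is always present before anything is executed.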

What it does

Scans files for PII — credit cards, SSNs, emails, phone numbers, API keys, IP addresses, mailing addresses, dates of birth, passport numbers, driver's license numbers, bank routing numbers, medical license numbers, and insurance member IDs — and replaces each instance with a numbered placeholder like [CREDIT_CARD_1].
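To illustrate what numbered-placeholder replacement looks like, here is a minimal sketch for a single category (email). It is not the script's actual code: the regex is a simplified stand-in, the function name redact_emails is hypothetical, and reusing one placeholder for a repeated value is an assumption about how numbering might work.

```python
import re

# Hypothetical, simplified pattern; the real script's regexes are not shown here.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(text: str) -> tuple[str, dict[str, str]]:
    """Replace each distinct email with [EMAIL_n].

    Returns the redacted text and an entity map (raw value -> placeholder).
    """
    entity_map: dict[str, str] = {}

    def _substitute(match: re.Match) -> str:
        raw = match.group(0)
        # Assumption: a value seen before reuses its first placeholder.
        if raw not in entity_map:
            entity_map[raw] = f"[EMAIL_{len(entity_map) + 1}]"
        return entity_map[raw]

    return EMAIL_RE.sub(_substitute, text), entity_map
```

The entity map returned here plays the role of the sidecar file described earlier: it contains raw PII, so in an agent setting it would be written to disk and never displayed.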

Usage

Sanitize a file (RECOMMENDED — always use --output)

python scripts/sanitize.py patient-notes.txt --output clean.txt

Preview mode (detect PII categories/offsets without showing raw values)

python scripts/sanitize.py notes.md --preview

JSON output (safe — no raw PII in stdout)

python scripts/sanitize.py report.txt --json --output clean.txt

Filter to specific categories

python scripts/sanitize.py log.txt --categories ssn,credit_card,email --output clean.txt

Supported PII categories

See references/SUPPORTED_PII.md for the full list with detection methods and false positive mitigation.

Category         Pattern type                       Example
---------------  ---------------------------------  -----------------------------------------------
credit_card      Luhn-validated 13-19 digits        4111 1111 1111 1111
ssn              3-2-4 digit groups                 123-45-6789
cvv              Keyword-anchored 3-4 digits        CVV: 123
expiry_date      Keyword-anchored MM/YY             expiry 01/30
api_key          Provider prefix patterns           sk-abc..., ghp_..., AKIA...
email            Standard email format              user@example.com
phone            US/intl phone numbers              +1 (555) 123-4567
ip_address       IPv4 addresses                     192.168.1.100
date_of_birth    Keyword-anchored dates             DOB: 03/15/1985
passport         Keyword-anchored alphanumeric      Passport: AB1234567
drivers_license  Keyword-anchored alphanumeric      DL: D12345678
bank_routing     Keyword-anchored 9 digits          routing: 021000021
address          Street + city/state/zip            742 Evergreen Terrace Dr, Springfield, IL 62704
medical_license  Keyword-anchored license ID        License: CA-MD-8827341
insurance_id     Keyword-anchored member/policy ID  Member ID: BCB-2847193
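The credit_card category is described as Luhn-validated, which filters out most random 13-19 digit runs. The sketch below shows the standard Luhn checksum under the assumption that spaces and hyphens are stripped first; the function name luhn_valid is hypothetical and the script's own implementation may differ.

```python
def luhn_valid(number: str) -> bool:
    """Return True if `number` has 13-19 digits and passes the Luhn checksum."""
    # Assumption: only spaces and hyphens are allowed as separators.
    cleaned = number.replace(" ", "").replace("-", "")
    if not (cleaned.isascii() and cleaned.isdigit()):
        return False
    if not 13 <= len(cleaned) <= 19:
        return False

    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, char in enumerate(reversed(cleaned)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as summing the two digits of the product
        total += digit
    return total % 10 == 0
```

For example, luhn_valid("4111 1111 1111 1111") accepts the test number from the credit_card row, while changing its last digit to 2 makes the check fail.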

Security and Privacy

  • All processing is local. The script makes zero network calls. No data leaves your machine.

  • Zero dependencies. Uses only Python standard library — no third-party packages to audit.

  • PII never reaches stdout. The --json and --preview modes strip raw PII values from output. The entity map (containing raw PII to placeholder mappings) is only written to a sidecar file on disk when --output is used.

  • Designed for agent safety. The skill instructions above tell the agent to never read the raw input file or the entity map file — only the sanitized output.

Requirements

  • Python 3.11+

  • No external dependencies (stdlib only)

About

Built by AgentWard — the open-source permission control plane for AI agents.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
