analyzing-malicious-pdf-with-peepdf

Perform static analysis of malicious PDF documents using peepdf, pdfid, and pdf-parser to extract embedded JavaScript, shellcode, and suspicious objects.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "analyzing-malicious-pdf-with-peepdf" with this command: npx skills add mukul975/anthropic-cybersecurity-skills/mukul975-anthropic-cybersecurity-skills-analyzing-malicious-pdf-with-peepdf

Analyzing Malicious PDF with peepdf

When to Use

  • When triaging suspicious PDF attachments from phishing emails
  • During malware analysis of PDF-based exploit documents
  • When extracting embedded JavaScript, shellcode, or executables from PDFs
  • For forensic examination of weaponized document artifacts
  • When building detection signatures for PDF-based threats

Prerequisites

  • Python 3.8+ with peepdf-3 installed (pip install peepdf-3)
  • pdfid.py and pdf-parser.py from Didier Stevens suite
  • Isolated analysis environment (VM or sandbox)
  • Optional: PyV8 (STPyV8 when using peepdf-3) for JavaScript emulation within peepdf
  • Optional: Pylibemu for shellcode analysis

Workflow

  1. Triage with pdfid: Scan the PDF for suspicious keywords (/JS, /JavaScript, /OpenAction, /Launch, /EmbeddedFile).
  2. Interactive Analysis: Open the PDF in peepdf interactive mode (peepdf -i) to explore its object structure.
  3. Identify Suspicious Objects: Locate objects containing JavaScript, streams, or encoded data.
  4. Extract Content: Dump suspicious streams and decode their filters (FlateDecode, ASCIIHexDecode).
  5. Deobfuscate JavaScript: Analyze the extracted JS for shellcode, heap sprays, or exploit code.
  6. Check VirusTotal: Use peepdf's vtcheck command to cross-reference the file hash with AV detections.
  7. Generate IOCs: Extract URLs, domains, hashes, and shellcode signatures.
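
The triage step can be approximated in a few lines of Python. The sketch below is not pdfid itself; it is a simplified stand-in that counts the keywords from step 1 in the raw bytes, after undoing the #xx hex escapes that PDF names allow (for example /J#53 for /JS). The names SUSPICIOUS_NAMES, normalize_names, and count_keywords are hypothetical and chosen for illustration only.

```python
import re

# Keywords from step 1 of the workflow (assumed list, not pdfid's full set).
SUSPICIOUS_NAMES = ["/JS", "/JavaScript", "/OpenAction", "/Launch", "/EmbeddedFile"]


def normalize_names(data: bytes) -> bytes:
    """Decode #xx hex escapes so obfuscated names such as /J#53 read as /JS."""
    return re.sub(
        rb"#([0-9A-Fa-f]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        data,
    )


def count_keywords(data: bytes) -> dict:
    """Count each suspicious name in the raw bytes of a PDF."""
    normalized = normalize_names(data)
    counts = {}
    for name in SUSPICIOUS_NAMES:
        # The lookahead keeps a short name from matching a longer one.
        pattern = re.escape(name.encode()) + rb"(?![A-Za-z0-9])"
        counts[name] = len(re.findall(pattern, normalized))
    return counts


sample = (
    b"1 0 obj << /OpenAction 2 0 R >> endobj "
    b"2 0 obj << /S /JavaScript /J#53 (app.alert(1)) >> endobj"
)
print(count_keywords(sample))
```

A raw byte scan like this misses keywords hidden inside compressed object streams, which is one reason to follow triage with the interactive peepdf session in step 2.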

Key Concepts

Concept            Description
/OpenAction        Action executed automatically when the PDF is opened
/JavaScript, /JS   Embedded JavaScript code in PDF objects
/Launch            Action that launches external applications
/EmbeddedFile      File embedded within the PDF structure
FlateDecode        zlib (deflate) compression filter, often used to conceal stream content
Object Streams     PDF objects stored inside compressed streams
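
To illustrate how stream filters hide content, here is a minimal sketch that reverses an ASCIIHexDecode and FlateDecode chain using only the Python standard library. It assumes the raw stream bytes and the filter names have already been pulled out of the object (for instance with peepdf or pdf-parser.py); the function names are hypothetical, and only these two filters are handled.

```python
import binascii
import zlib


def decode_ascii_hex(data: bytes) -> bytes:
    """Decode ASCIIHexDecode data: hex digits, optional whitespace, '>' ends it."""
    hex_part = data.split(b">", 1)[0]
    digits = b"".join(hex_part.split())
    if len(digits) % 2:
        digits += b"0"  # an odd trailing digit is treated as if followed by 0
    return binascii.unhexlify(digits)


def decode_flate(data: bytes) -> bytes:
    """Decode FlateDecode data, which is a zlib stream."""
    return zlib.decompress(data)


def decode_stream(data: bytes, filters: list) -> bytes:
    """Apply decoders in the order the filters are listed for the stream."""
    decoders = {
        "/ASCIIHexDecode": decode_ascii_hex,
        "/FlateDecode": decode_flate,
    }
    for name in filters:
        data = decoders[name](data)
    return data


# Build a doubly encoded sample, then recover the original script text.
payload = b"app.alert(1)"
encoded = binascii.hexlify(zlib.compress(payload)) + b">"
print(decode_stream(encoded, ["/ASCIIHexDecode", "/FlateDecode"]))
```

Malicious documents often stack several filters on one stream, so applying the decoders in listed order matters.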

Tools & Systems

Tool               Purpose
peepdf / peepdf-3  Interactive PDF analysis with JS emulation
pdfid.py           Quick triage scanning for suspicious keywords
pdf-parser.py      Deep object-level PDF parsing
VirusTotal         Hash lookup and AV detection cross-reference
CyberChef          Decode and transform extracted payloads
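
A hash lookup starts with the file's SHA-256 digest. The sketch below computes it with the standard library and reads the file in chunks, so the sample is never opened by a PDF reader. The function name sha256_of_file and the 64 KiB chunk size are assumptions made for illustration.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The resulting hex string is what goes into the SHA-256 field of the report and into a manual VirusTotal search.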

Output Format

Analysis Report: PDF-MAL-[DATE]-[SEQ]
File: [filename.pdf]
SHA-256: [hash]
Suspicious Keywords: [/JS, /OpenAction, etc.]
Objects with JavaScript: [Object IDs]
Extracted URLs: [List]
Shellcode Detected: [Yes/No]
Embedded Files: [Count and types]
VirusTotal Detections: [X/Y engines]
Risk Level: [Critical/High/Medium/Low]
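
The template above can be filled programmatically once the findings are collected. The following is one possible sketch: the PdfReport class and render_report function are hypothetical, and the identifier layout (date as YYYYMMDD, sequence zero-padded to three digits) is an assumption, since the template does not specify either.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class PdfReport:
    """Findings for one analyzed PDF (hypothetical container)."""
    filename: str
    sha256: str
    risk_level: str
    keywords: list = field(default_factory=list)
    js_objects: list = field(default_factory=list)
    urls: list = field(default_factory=list)
    shellcode: bool = False
    embedded_files: str = "0"
    vt_detections: str = "not checked"


def render_report(report: PdfReport, day: date, seq: int) -> str:
    """Render the findings in the report layout shown above."""

    def joined(items):
        return ", ".join(str(item) for item in items) if items else "None"

    lines = [
        # Assumed ID format: PDF-MAL-YYYYMMDD-NNN
        f"Analysis Report: PDF-MAL-{day:%Y%m%d}-{seq:03d}",
        f"File: {report.filename}",
        f"SHA-256: {report.sha256}",
        f"Suspicious Keywords: {joined(report.keywords)}",
        f"Objects with JavaScript: {joined(report.js_objects)}",
        f"Extracted URLs: {joined(report.urls)}",
        f"Shellcode Detected: {'Yes' if report.shellcode else 'No'}",
        f"Embedded Files: {report.embedded_files}",
        f"VirusTotal Detections: {report.vt_detections}",
        f"Risk Level: {report.risk_level}",
    ]
    return "\n".join(lines)


example = PdfReport(
    filename="invoice.pdf",
    sha256="ab" * 32,  # placeholder digest, not a real sample
    risk_level="High",
    keywords=["/JS", "/OpenAction"],
    js_objects=[5, 12],
)
print(render_report(example, date(2024, 1, 2), 1))
```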
