Process File Skill
A generic file-processing Skill that parses multiple file formats and archives them intelligently, fully integrated with the AkashicRecords governance system.
When to use this Skill
- User says "read", "process"
- User says "archive", "import"
- User provides file path for processing
- User wants to integrate external files into knowledge base
- User provides email, PDF, Office documents, images, etc.
Workflow
- Initialization - Read Preferences
Check claude.md:
- Read current project's claude.md
- Look for a record that references file-handling-preferences
- If a path is found, read the preferences file
If no preferences file exists:
- Ask user: "This is the first time using process-file skill in this project. Where would you like to create the file handling preferences?"
- Suggest default location: file-handling-preferences.md in project root
- After user confirmation, create file and record location in claude.md
Preferences file structure:
- Processing pattern records (by file type and content category)
- Auto processing settings (whether to allow saving without confirmation)
- Historical processing records
- File Type Detection
Detect file type: Determine processing method based on file extension:
| Type | Extension | Processing Tool |
|---|---|---|
| Email | .eml | mu view <filepath> |
| PDF | .pdf | markitdown <filepath> |
| Word | .docx | markitdown <filepath> |
| PowerPoint | .pptx | markitdown <filepath> |
| Excel | .xlsx | markitdown <filepath> |
| Image | .jpg, .png, .gif, .webp, .bmp | Read tool (language model direct read) |
| Audio | .mp3, .wav, .m4a, .aac, .ogg | Ask user |
| Video | .mp4, .mov, .avi, .webm | Ask user |
Tool availability check:
- Check if required tools are installed before execution
- If mu is not installed, prompt: "Please install maildir-utils: sudo apt install maildir-utils"
- If markitdown is not installed, prompt: "Please install markitdown: pip install markitdown"
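The detection and availability checks above can be sketched roughly as follows. This is a minimal illustration, not the Skill's actual implementation: the names TOOL_BY_EXTENSION, detect_tool, build_command, and missing_tool_hint are hypothetical, and the labels "read" and "ask" are placeholders for "use the Read tool" and "ask the user".

```python
import shutil
from pathlib import Path
from typing import List, Optional

# Extension-to-tool mapping copied from the table above.
# "read" and "ask" are hypothetical labels used only in this sketch.
TOOL_BY_EXTENSION = {
    ".eml": "mu",
    ".pdf": "markitdown",
    ".docx": "markitdown",
    ".pptx": "markitdown",
    ".xlsx": "markitdown",
    **{ext: "read" for ext in (".jpg", ".png", ".gif", ".webp", ".bmp")},
    **{ext: "ask" for ext in (".mp3", ".wav", ".m4a", ".aac", ".ogg")},
    **{ext: "ask" for ext in (".mp4", ".mov", ".avi", ".webm")},
}

# Install prompts quoted from the tool availability check.
INSTALL_HINT = {
    "mu": "Please install maildir-utils: sudo apt install maildir-utils",
    "markitdown": "Please install markitdown: pip install markitdown",
}


def detect_tool(filepath: str) -> Optional[str]:
    """Return the processing tool for a file, or None if the format is unsupported."""
    return TOOL_BY_EXTENSION.get(Path(filepath).suffix.lower())


def build_command(filepath: str) -> Optional[List[str]]:
    """Return the argv for CLI-backed formats, or None for Read/ask/unsupported."""
    tool = detect_tool(filepath)
    if tool == "mu":
        return ["mu", "view", filepath]
    if tool == "markitdown":
        return ["markitdown", filepath]
    return None


def missing_tool_hint(tool: str) -> Optional[str]:
    """Return the install prompt if a CLI tool is not on PATH, else None."""
    if tool in INSTALL_HINT and shutil.which(tool) is None:
        return INSTALL_HINT[tool]
    return None
```

Matching on the lowercased suffix means Report.PDF and report.pdf are treated alike; whether the Skill should also inspect file contents is left open here.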
- Content Extraction
Email (.eml):
mu view <filepath>
Extract: sender, recipient, subject, date, body
PDF/Office documents:
markitdown <filepath>
Convert to markdown format
Images: Use Read tool to directly read image, let language model analyze content:
- Identify image subject
- Extract text (if any)
- Describe image content
Audio/Video:
- Ask user for suggested processing method
- Possible options:
  - Record file metadata only
  - Use external tool for transcription
  - Record manual summary
- Record user's chosen processing method in project claude.md
- Content Analysis
Analyze content:
- Identify topics and keywords
- Determine content type (technical, personal, work, academic, etc.)
- Extract important information (dates, people, places, events)
Infer user intent:
- Archive for storage (long-term preservation)
- Project update (related to existing project)
- Record memo (personal notes)
- Data organization (batch processing)
Match against preferences:
- Check if preferences file has matching patterns
- If historical records exist, prioritize suggesting same processing method
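One way to picture the preference match is a lookup keyed on file type and content category. The sketch below is an assumption about how pattern records might be represented; PatternRecord, its fields, and match_preference are hypothetical names, and the document does not specify how ties between records are resolved (here the most frequently used record wins).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PatternRecord:
    """Hypothetical in-memory form of one processing pattern record."""
    file_type: str       # e.g. "PDF"
    category: str        # e.g. "technical paper"
    target: str          # e.g. "Research/Papers/"
    times_used: int = 1  # how often the user accepted this pattern


def match_preference(
    records: List[PatternRecord], file_type: str, category: str
) -> Optional[PatternRecord]:
    """Return the most-used record matching file type and category, or None."""
    candidates = [
        r for r in records
        if r.file_type == file_type and r.category == category
    ]
    return max(candidates, key=lambda r: r.times_used, default=None)
```

When no record matches, the function returns None and the workflow would fall through to directory discovery.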
- Directory Discovery
Use akashicrecords mechanism:
- Based on content analysis results, build search query
- Scan knowledge base directory structure
- Read each directory's RULE.md to understand purpose
- Evaluate content-to-directory purpose match
Suggestion logic:
- Technical document + directory purpose "research" → high match
- Email + directory purpose "communications" → high match
- Personal photo + directory purpose "personal life" → high match
- No clear match → suggest Miscellaneous or ask user
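The suggestion logic above could be approximated with a simple keyword-overlap score. This is only a sketch under stated assumptions: the document does not define how the match is actually evaluated, suggest_directory is a hypothetical helper, and the purpose strings stand in for text that would be read from each directory's RULE.md.

```python
from typing import Dict, Iterable


def suggest_directory(
    keywords: Iterable[str],
    purposes: Dict[str, str],
    threshold: int = 1,
) -> str:
    """Pick the directory whose purpose text shares the most words with the keywords.

    purposes maps a directory path to its purpose description (assumed to come
    from RULE.md). Falls back to "Miscellaneous" when nothing reaches the threshold.
    """
    wanted = {k.lower() for k in keywords}
    best_dir, best_score = None, 0
    for directory, purpose in purposes.items():
        score = len(wanted & set(purpose.lower().split()))
        if score > best_score:  # strict comparison: the first directory wins ties
            best_dir, best_score = directory, score
    if best_dir is None or best_score < threshold:
        return "Miscellaneous"
    return best_dir
```

A real implementation would likely rely on the language model's judgment rather than word overlap; the sketch only shows the shape of "score each directory, fall back when nothing matches".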
- User Confirmation
Present analysis results:
File Analysis Results
File: [filename]
Type: [file type]
Content Summary: [brief summary]
Inferred Intent: [archive/update/record]
Suggested Location: [target directory path]
Reason: [why this location was chosen]
Planned Operation:
- Call add-content skill
- Format: [according to RULE.md]
- Filename: [suggested filename]
Do you approve this operation?
Wait for confirmation:
- Default requires user approval
- If auto_save: true in preferences, can skip confirmation
- User can modify suggested location or cancel
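The confirmation rules can be expressed as two small decisions. The sketch below assumes the preferences have already been parsed into a dictionary with an auto_save key; needs_confirmation and resolve_target are hypothetical helpers, and the reply vocabulary ("yes", "cancel", or a replacement path) is an assumption for illustration only.

```python
from typing import Optional


def needs_confirmation(preferences: dict) -> bool:
    """Confirmation is required unless auto_save is explicitly the boolean True."""
    return preferences.get("auto_save") is not True


def resolve_target(suggested: str, reply: str) -> Optional[str]:
    """Interpret the user's reply to the approval prompt.

    "yes" or an empty reply keeps the suggestion, "cancel" returns None,
    and any other reply is treated as a replacement location.
    """
    answer = reply.strip()
    if answer.lower() in ("", "yes", "y"):
        return suggested
    if answer.lower() == "cancel":
        return None
    return answer
```

Treating anything other than a literal True as "ask first" keeps the default on the safe side when the preferences file is missing or malformed.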
- Execute
Call corresponding akashicrecords skill:
- Add new content → add-content skill
- Update existing → update-content skill
Format according to target RULE.md:
- Read target directory's RULE.md
- Follow naming conventions
- Apply frontmatter format (if required)
- Update Preferences
Record this processing experience:
[Date] [File Type]
- Content characteristics: [key features]
- Target location: [actual storage location]
- Processing method: [skill used]
Learning pattern:
- Accumulate user preferences
- Prioritize suggesting same method for similar content next time
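A record in the shape shown above could be produced and appended as follows. This is a sketch with assumptions: the bracketed fields are read as placeholders (so the output has no literal brackets), the date is written in ISO format, and format_record and append_record are hypothetical helper names.

```python
from datetime import date
from pathlib import Path


def format_record(
    day: date, file_type: str, characteristics: str, target: str, method: str
) -> str:
    """Render one processing record in the layout used by the preferences file."""
    return (
        f"{day.isoformat()} {file_type}\n"
        f"- Content characteristics: {characteristics}\n"
        f"- Target location: {target}\n"
        f"- Processing method: {method}\n"
    )


def append_record(preferences_path: str, record: str) -> None:
    """Append a record to the preferences file, creating the file if needed."""
    with Path(preferences_path).open("a", encoding="utf-8") as handle:
        handle.write(record + "\n")
```

Appending rather than rewriting keeps the historical processing records intact, which is what later preference matching relies on.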
Multi-File Processing
When user provides multiple files:
Parallel Analysis
- Launch a subagent for each file
- Each subagent independently executes Phases 2-5 of the workflow (File Type Detection through Directory Discovery)
- Wait for all subagents to complete
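The fan-out and wait pattern can be illustrated with a thread pool. Subagents are not threads, so this is only an analogy for "one worker per file, then collect every result"; analyze_file is a hypothetical placeholder for the per-file analysis and returns only trivial metadata here.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Dict, List


def analyze_file(filepath: str) -> Dict[str, str]:
    """Placeholder for the per-file work a subagent would do (detection through discovery)."""
    path = Path(filepath)
    return {"file": path.name, "extension": path.suffix.lower()}


def analyze_all(filepaths: List[str]) -> List[Dict[str, str]]:
    """Analyze every file concurrently and return results in input order."""
    if not filepaths:
        return []
    # One worker per file mirrors "launch a subagent for each file".
    with ThreadPoolExecutor(max_workers=len(filepaths)) as pool:
        # map() yields results in the order of the inputs; leaving the
        # with-block waits for all workers to finish.
        return list(pool.map(analyze_file, filepaths))
```

Keeping the results in input order makes it straightforward to number the rows of the consolidated table.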
Consolidated Presentation
Multi-File Processing Analysis Results
| # | Filename | Type | Content Summary | Suggested Location | Operation |
|---|---|---|---|---|---|
| 1 | file1.pdf | PDF | [summary] | Research/ | add-content |
| 2 | photo.jpg | Image | [summary] | Personal/ | add-content |
| 3 | email.eml | Email | [summary] | Work/ | add-content |
Please choose:
- Approve all
- Confirm individually
- Cancel
Batch Execution
- After user approves all, execute sequentially
- When user confirms individually, confirm each file separately
Error Handling
Tool Not Installed
Warning: Cannot process .eml file: mu tool not installed
Please run: sudo apt install maildir-utils
Unsupported File Format
Warning: Unsupported file format: .xyz
How would you like to proceed?
- Try reading as plain text
- Record file metadata only
- Skip this file
Parse Failure
Warning: Unable to parse file content
Error: [error message]
How would you like to proceed?
- Retry
- Enter summary manually
- Skip this file
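The "tool not installed" and "parse failure" cases can both be caught around a single subprocess call. The sketch below is an assumption about how extraction might be wrapped, not the Skill's actual code; run_tool is a hypothetical helper that returns a success flag plus either the tool's output or an error message to show the user.

```python
import subprocess
from typing import List, Tuple


def run_tool(argv: List[str], timeout: float = 60) -> Tuple[bool, str]:
    """Run an extraction command and return (ok, output_or_error_message)."""
    try:
        done = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout, check=True
        )
    except FileNotFoundError:
        # The executable is not on PATH: the "Tool Not Installed" case.
        return False, f"{argv[0]} tool not installed"
    except subprocess.CalledProcessError as exc:
        # Non-zero exit status: the "Parse Failure" case.
        return False, exc.stderr.strip() or f"exit code {exc.returncode}"
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout}s"
    return True, done.stdout
```

For example, run_tool(["markitdown", "paper.pdf"]) would return the converted markdown on success, or a message that can be placed in the Error line of the parse-failure warning.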
Integration with Governance
Before operation:
- Read preferences file
- Confirm akashicrecords governance structure exists
During operation:
- Use akashicrecords skills for actual operations
- Follow target directory's RULE.md
After operation:
- Update preferences file
- akashicrecords skills automatically handle README.md updates
Examples
Example 1: Process PDF Paper
User: "Read ~/Downloads/transformer-paper.pdf"
Workflow:
- Check preferences → Find historical record "technical paper → Research/Papers/"
- Detect .pdf → Use markitdown
- Execute markitdown ~/Downloads/transformer-paper.pdf
- Analyze content → AI/machine learning topic
- Match preferences → Matches "technical paper" pattern
- Suggest Research/Papers/AI/
- User confirms
- Call add-content skill
- Update preferences file
Example 2: Batch Process Emails
User: "Archive these emails: email1.eml email2.eml email3.eml"
Workflow:
- Detect multiple files → Launch 3 subagents
- Each subagent processes in parallel:
  - Parse using mu view
  - Analyze sender, subject, content
  - Suggest target location
- Consolidate results into list
- User selects "Approve all"
- Execute add-content sequentially
- Update preferences
Example 3: Process Image
User: "Process this photo ~/Photos/vacation.jpg"
Workflow:
- Detect .jpg → Use Read tool
- Language model analyzes image content → "Beach vacation photo"
- Check preferences → Find "travel photos → Personal/Travel/"
- Suggest Personal/Travel/2025/
- User confirms
- Call add-content (convert to descriptive markdown)
- Update preferences
Best Practices
- Always check preferences first - Prioritize historical processing patterns
- Confirm before saving - Default requires user approval
- Update preferences after success - Accumulate learned user preferences
- Use parallel processing - Leverage subagents for multiple files
- Handle errors gracefully - Provide alternatives
- Integrate with akashicrecords - Use existing skills for operations
Notes
- Preferences file path is recorded in project claude.md
- Each project can have different preferences
- Audio/video processing methods are recorded in claude.md
- This Skill does not modify files directly; it operates through akashicrecords skills