markdown-converter

Convert documents and files to Markdown using markitdown. Use when converting PDF, Word (.docx), PowerPoint (.pptx), Excel (.xlsx, .xls), HTML, CSV, JSON, XML, images (with EXIF/OCR), audio (with transcription), ZIP archives, YouTube URLs, or EPubs to Markdown format for LLM processing or text analysis.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "markdown-converter" with this command: npx skills add steipete/markdown-converter

Markdown Converter

Convert files to Markdown using uvx markitdown — no installation required.

Basic Usage

# Convert to stdout
uvx markitdown input.pdf

# Save to file
uvx markitdown input.pdf -o output.md
uvx markitdown input.docx > output.md

# From stdin
cat input.pdf | uvx markitdown

Supported Formats

  • Documents: PDF, Word (.docx), PowerPoint (.pptx), Excel (.xlsx, .xls)
  • Web/Data: HTML, CSV, JSON, XML
  • Media: Images (EXIF + OCR), Audio (EXIF + transcription)
  • Other: ZIP (iterates contents), YouTube URLs, EPub

Options

-o OUTPUT      # Output file
-x EXTENSION   # Hint file extension (for stdin)
-m MIME_TYPE   # Hint MIME type
-c CHARSET     # Hint charset (e.g., UTF-8)
-d             # Use Azure Document Intelligence
-e ENDPOINT    # Document Intelligence endpoint
--use-plugins  # Enable 3rd-party plugins
--list-plugins # Show installed plugins

Examples

# Convert Word document
uvx markitdown report.docx -o report.md

# Convert Excel spreadsheet
uvx markitdown data.xlsx > data.md

# Convert PowerPoint presentation
uvx markitdown slides.pptx -o slides.md

# Convert with file type hint (for stdin)
cat document | uvx markitdown -x .pdf > output.md

# Use Azure Document Intelligence for better PDF extraction
uvx markitdown scan.pdf -d -e "https://your-resource.cognitiveservices.azure.com/"

Notes

  • Output preserves document structure: headings, tables, lists, links
  • First run caches dependencies; subsequent runs are faster
  • For complex PDFs with poor extraction, use -d with Azure Document Intelligence

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

Research

Autoresearch.Bak

Autonomous experiment loop for AI agents. Use when the user wants to run systematic experiments — optimizing hyperparameters, searching for better configurat...

Registry SourceRecently Updated
Research

Clone Anywebsite

High-fidelity visual-first web rebuilding from design references. Screenshot-driven analysis, DOM interrogation for exact CSS values, asset inspection (WebGL...

Registry SourceRecently Updated
2120Profile unavailable
Research

jeffli-content-factory

Create complete WeChat Official Account viral articles from a user-provided title by researching high-view YouTube videos, confirming topic/outline with user...

Registry SourceRecently Updated
2450Profile unavailable
Research

Paperform

PaperForm integration. Manage Forms. Use when the user wants to interact with PaperForm data.

Registry SourceRecently Updated
2910Profile unavailable