markitdown

Safety Notice

This listing is imported from the skills.sh public index metadata. Review the upstream SKILL.md and the repository scripts before running them.

Install skill "markitdown" with this command: npx skills add yutakobayashidev/dotnix/yutakobayashidev-dotnix-markitdown

MarkItDown

Purpose

Convert a wide variety of file formats into Markdown text using Microsoft's markitdown CLI. Useful for extracting text from documents for LLM analysis, summarization, or ingestion into knowledge bases.

Supported Formats

| Category | Formats |
| --- | --- |
| Documents | PDF, DOCX, PPTX, XLSX, XLS |
| Web & Data | HTML, CSV, JSON, XML |
| Media | Images (EXIF + OCR), Audio (metadata + transcription) |
| eBooks | EPub |
| Archives | ZIP (iterates over contents) |
| Other | YouTube URLs, Outlook messages |

Basic Usage

Convert a file (output to stdout)

markitdown path/to/file.pdf

Save output to a file

markitdown path/to/file.pdf -o output.md

Pipe from stdin

cat path/to/file.pdf | markitdown

Options

| Flag | Description |
| --- | --- |
| `-o <file>` | Write output to a file instead of stdout |
| `-d` | Use Azure Document Intelligence for conversion |
| `-e "<endpoint>"` | Azure Document Intelligence endpoint URL |
| `--use-plugins` | Enable third-party plugins |
| `--list-plugins` | Show installed plugins |
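The flags above can be combined in a single call. The following is a minimal sketch, not part of the upstream skill: the function names and the `MARKITDOWN_BIN` variable are hypothetical helpers introduced here for illustration, and the endpoint is a value you must supply yourself.

```shell
# MARKITDOWN_BIN is a hypothetical override (not a markitdown feature) that
# lets the converter be swapped or stubbed. It defaults to the real CLI.
MARKITDOWN_BIN="${MARKITDOWN_BIN:-markitdown}"

# Convert via Azure Document Intelligence (-d with -e) and save the Markdown (-o).
# $1 = input file, $2 = output file, $3 = your Document Intelligence endpoint URL
convert_with_docintel() {
  "$MARKITDOWN_BIN" "$1" -o "$2" -d -e "$3"
}

# Convert with third-party plugins enabled and save the Markdown.
# $1 = input file, $2 = output file
convert_with_plugins() {
  "$MARKITDOWN_BIN" --use-plugins "$1" -o "$2"
}
```

For example, `convert_with_docintel report.pdf report.md "<endpoint>"` writes the converted text to `report.md`. Running `markitdown --list-plugins` first shows which plugins `--use-plugins` would enable.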

Workflow

Single File Conversion

Convert and capture the result

result=$(markitdown document.pdf)

Convert and save

markitdown document.pdf -o document.md

Batch Conversion

Convert all PDFs in a directory

for f in *.pdf; do markitdown "$f" -o "${f%.pdf}.md"; done
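The one-line loop stops being convenient when a directory has no PDFs or when one file fails to convert. The sketch below is one possible hardening, assuming you want to continue past failures and get a non-zero status at the end. The `convert_pdfs` function and the `MARKITDOWN_BIN` variable are hypothetical names introduced here, not part of markitdown.

```shell
# MARKITDOWN_BIN is a hypothetical override; it defaults to the real CLI.
MARKITDOWN_BIN="${MARKITDOWN_BIN:-markitdown}"

# Convert every PDF in a directory, continuing past individual failures.
# $1 = directory (defaults to the current one).
# Returns non-zero if any file failed to convert.
convert_pdfs() {
  dir="${1:-.}"
  failed=0
  for f in "$dir"/*.pdf; do
    # With no matches the glob stays literal, so skip anything that is not a file.
    [ -f "$f" ] || continue
    if ! "$MARKITDOWN_BIN" "$f" -o "${f%.pdf}.md"; then
      echo "conversion failed: $f" >&2
      failed=1
    fi
  done
  return "$failed"
}
```

Calling `convert_pdfs ./reports` writes one `.md` file next to each `.pdf` and lists any failures on stderr.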

Pipe into Other Tools

Convert and count words

markitdown document.pdf | wc -w

Convert and search for a term

markitdown document.pdf | grep -i "search term"

Agent Usage Notes

  • Output goes to stdout by default. Capture it in a variable or redirect to a file.

  • For large files, prefer saving to a file with -o rather than capturing stdout.

  • Image conversion extracts EXIF metadata and OCR text. For richer image descriptions, use the Python API with an LLM client instead.

  • ZIP files are automatically extracted and each contained file is converted.

  • If conversion fails for a format, check that the corresponding optional dependency is installed (e.g., markitdown[pdf] for PDF support).
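The notes above can be folded into a small wrapper that an agent calls instead of the bare CLI. This is a sketch under stated assumptions: `convert_to_file` and `MARKITDOWN_BIN` are hypothetical names introduced here, and the failure hint is only a guess at the cause, since a missing optional dependency is one of several reasons a conversion can fail.

```shell
# MARKITDOWN_BIN is a hypothetical override; it defaults to the real CLI.
MARKITDOWN_BIN="${MARKITDOWN_BIN:-markitdown}"

# Convert one file to Markdown on disk (rather than capturing stdout)
# and print the path of the output file.
# $1 = input file, $2 = output file (defaults to "<input>.md")
convert_to_file() {
  out="${2:-$1.md}"
  if ! command -v "$MARKITDOWN_BIN" >/dev/null 2>&1; then
    echo "markitdown not found on PATH" >&2
    return 127
  fi
  if ! "$MARKITDOWN_BIN" "$1" -o "$out"; then
    # One possible cause is a missing optional dependency, e.g. markitdown[pdf] for PDFs.
    echo "conversion failed for $1; check the optional dependency for this format" >&2
    return 1
  fi
  printf '%s\n' "$out"
}
```

An agent can then run `md_path=$(convert_to_file big-report.pdf)` and read the file at `$md_path` in chunks, which keeps large outputs out of shell variables.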
