docx-analysis

Analyze the structure and content of .docx (Word) files using MinerU. Returns structured Markdown preserving headings, tables, lists, and layout for content understanding. Features: structural analysis of .docx files. Quick analysis mode (flash-extract) without token. Full analysis with token for complex documents with tables and formulas. Page range selection for large files. Use when you need to: analyze a .docx file structure, understand content layout of a Word document, inspect .docx headings and sections, get an overview of a Word file. Use when asked: 'how do I analyze this docx', 'what is inside this Word file', 'I want to understand this .docx structure', 'can my agent analyze docx files', 'break down this docx for me'. Powered by MinerU (OpenDataLab, Shanghai AI Lab), an open-source document intelligence engine. Supports multilingual content. Ideal for content reviewers, editors, and automated quality checks on Word documents.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "docx-analysis" with this command: npx skills add mzlzyca/docx-analysis

Docx Analysis

Analyze the structure and content of Word (.docx) files using MinerU. Returns structured Markdown with headings, tables, and layout preserved.

Install

npm install -g mineru-open-api
# or via Go (macOS/Linux):
go install github.com/opendatalab/MinerU-Ecosystem/cli/mineru-open-api@latest

Quick Start

# Quick analysis (no token required)
mineru-open-api flash-extract report.docx

# Save to directory
mineru-open-api flash-extract report.docx -o ./out/

# Full analysis with tables and formulas (requires token)
mineru-open-api extract report.docx -o ./out/

# With language hint
mineru-open-api flash-extract report.docx --language en

Authentication

No token needed for flash-extract. Token required for extract:

mineru-open-api auth             # Interactive token setup
export MINERU_TOKEN="your-token" # Or via environment variable

Create token at: https://mineru.net/apiManage/token

Capabilities

  • Supported input: .docx (local file or URL)
  • flash-extract: no token, max 10 MB / 20 pages, Markdown output
  • extract: token required, full features (tables, formulas, multi-format output)
  • Language hint with --language (default: ch, use en for English)
  • Page range with --pages (e.g. 1-10)

Notes

  • For .doc (legacy format), use doc-analysis skill instead
  • Output goes to stdout by default; use -o <dir> to save to a file or directory
  • All progress/status messages go to stderr; document content goes to stdout
  • MinerU is open-source by OpenDataLab (Shanghai AI Lab): https://github.com/opendatalab/MinerU

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

Ephemeral Media Hosting

自動削除機能付き一時メディアホスティングシステム

Registry SourceRecently Updated
General

Ethereum Read Only

Foundry castを使用したウォレット不要のオンチェーン状態読み取り

Registry SourceRecently Updated
General

OpenClaw Memory

Manage, optimize, and troubleshoot the OpenClaw memory system — MEMORY.md curation, daily logs (memory/YYYY-MM-DD.md), memory_search tuning, compaction survi...

Registry SourceRecently Updated
General

ImageRouter

Generate AI images with any model using ImageRouter API (requires API key).

Registry SourceRecently Updated
2.6K2dawe35