doc-to-html

Convert Word documents (.doc, .docx) to HTML using MinerU's document processing engine. The output is clean HTML that preserves document structure and formatting.

Features: supports both legacy .doc and modern .docx; maintains headings, tables, lists, and paragraph formatting in the HTML output.

Use when you need to: convert a Word document to HTML, turn .docx into a web page, generate HTML from Word files, or create HTML content from .doc files.

Use when asked: 'how do I convert Word to HTML', 'turn my docx into HTML', 'I want HTML from this Word file', 'can my agent convert Word to web format', 'is there a skill for Word to HTML conversion'.

Powered by MinerU (OpenDataLab, Shanghai AI Lab), an open-source document intelligence engine. Supports English, Chinese, and multilingual documents. Ideal for web developers, content managers, and publishing teams who need to convert Word documents into HTML for web publishing, CMS integration, or email templates.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "doc-to-html" with this command: npx skills add mzlzyca/doc-to-html

Doc To HTML

Convert Word (.doc/.docx) documents to HTML using MinerU.

Install

npm install -g mineru-open-api
# or via Go (macOS/Linux):
go install github.com/opendatalab/MinerU-Ecosystem/cli/mineru-open-api@latest

Quick Start

# Convert .docx to HTML (requires token)
mineru-open-api extract report.docx -f html -o ./out/

# Convert .doc to HTML (requires token)
mineru-open-api extract report.doc -f html -o ./out/

# With language hint
mineru-open-api extract report.docx -f html --language en -o ./out/
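
For a folder of Word files, the same command can be wrapped in a loop. The following is a minimal sketch, not part of the skill itself: the function name convert_dir, the MINERU_BIN override variable, and the one-subdirectory-per-input layout are assumptions made for illustration. Only the extract invocation is taken from the examples above.

```shell
# Sketch: convert every .doc/.docx file in a directory to HTML.
# Assumptions (not from the skill docs): convert_dir, MINERU_BIN, and the
# per-file output subdirectory naming are illustrative choices.
MINERU_BIN="${MINERU_BIN:-mineru-open-api}"

convert_dir() {
  local src_dir="$1" out_dir="$2" lang="${3:-ch}"
  local file base name failed=0
  for file in "$src_dir"/*.doc "$src_dir"/*.docx; do
    [ -e "$file" ] || continue          # skip glob patterns that matched nothing
    base="$(basename "$file")"
    name="${base%.*}_${base##*.}"       # report.docx -> report_docx
    mkdir -p "$out_dir/$name"
    # Same invocation as the Quick Start examples above.
    if ! "$MINERU_BIN" extract "$file" -f html --language "$lang" -o "$out_dir/$name/"; then
      echo "failed: $file" >&2
      failed=1
    fi
  done
  return "$failed"
}
```

Calling convert_dir ./docs ./out en would run one extract per Word file in ./docs and keep going after a failure, returning a non-zero status if any file failed.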

Authentication

Token required:

mineru-open-api auth             # Interactive token setup
export MINERU_TOKEN="your-token" # Or via environment variable

Create token at: https://mineru.net/apiManage/token
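
In scripts it can help to check for the token before calling extract. The sketch below assumes the token is supplied through the MINERU_TOKEN environment variable as shown above; the function name require_token is hypothetical, and a token stored by the interactive mineru-open-api auth setup would not be detected by this check.

```shell
# Sketch: fail fast when MINERU_TOKEN is missing from the environment.
# Assumption: require_token is an illustrative helper, not part of the CLI.
require_token() {
  if [ -z "${MINERU_TOKEN:-}" ]; then
    echo "MINERU_TOKEN is not set; create one at https://mineru.net/apiManage/token" >&2
    return 1
  fi
}
```

A script could then guard the conversion with: require_token && mineru-open-api extract report.docx -f html -o ./out/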

Capabilities

  • Supported input: .doc, .docx (local file or URL)
  • Output format: HTML (-f html)
  • HTML output requires extract with token — not available in flash-extract
  • Language hint with --language (default: ch, use en for English)
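
Since only .doc and .docx inputs are listed, a script might filter inputs before sending them to the CLI. This is a rough sketch under the assumption that the file extension is a good enough signal; the helper name is_supported_input is hypothetical, and the CLI may apply its own, different validation.

```shell
# Sketch: accept local paths or URLs that end in .doc or .docx.
# Assumption: is_supported_input is an illustrative helper, not part of the CLI.
is_supported_input() {
  # Strip any URL query string or fragment before checking the extension.
  local path="${1%%[?#]*}"
  case "$path" in
    *.doc|*.docx|*.DOC|*.DOCX) return 0 ;;
    *) return 1 ;;
  esac
}
```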

Notes

  • HTML output (-f html) is only available via extract with token
  • Output goes to stdout by default; use -o <dir> to save it under a directory instead
  • All progress/status messages go to stderr; document content goes to stdout
  • MinerU is open-source by OpenDataLab (Shanghai AI Lab): https://github.com/opendatalab/MinerU
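
Because content and status messages use separate streams, plain shell redirection can split them. The sketch below relies on the stdout/stderr behavior described in the notes above; the function name to_html_file and the MINERU_BIN override are assumptions for illustration.

```shell
# Sketch: capture HTML from stdout in one file and progress messages from stderr in a log.
# Assumptions: to_html_file and MINERU_BIN are illustrative, not part of the CLI.
MINERU_BIN="${MINERU_BIN:-mineru-open-api}"

to_html_file() {
  local input="$1" html_out="$2" log_out="${3:-/dev/null}"
  # No -o flag here, so the document content is expected on stdout.
  "$MINERU_BIN" extract "$input" -f html > "$html_out" 2> "$log_out"
}
```

For example, to_html_file report.docx report.html convert.log would leave the HTML in report.html and any status output in convert.log.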
