tex2docx

Convert LaTeX (.tex) academic papers to Word (.docx) with editable OMML equations, native Word tables, embedded figures, IEEE two-column format, and bibliography. Use when a user provides a .tex file and asks for a Word/DOCX version, or when converting academic LaTeX papers to editable Office format.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "tex2docx" with this command: npx skills add wsyummy/tex2docx

tex2docx — LaTeX to Word Converter

Requirements

  • pandoc (system install): winget install pandoc or pandoc.org
  • Python packages: pip install python-docx lxml pypandoc_binary

Usage

python scripts/tex2docx.py input.tex [output.docx]

If output.docx is omitted, output is input.docx in the same directory.

How It Works (Three Phases)

.tex ──→ [pandoc] ──→ OMML equations (13+ Word-editable formulas)
  │
  └──→ [Custom parser] ──→ Native Word tables ├──→ Final .docx
                           Embedded figures     │   (merged)
                           Formatted refs       │
                           IEEE layout & font  ┘

Phase 1 — Pandoc

Runs pandoc via pypandoc. Input file must be in its own directory (with figures/ subfolder if images exist). The script chdirs to the tex directory before running pandoc so image paths resolve correctly.

Phase 2 — Custom LaTeX Parser

RegEx-based extraction of:

  • Tables: \begin{table} → Word Table objects (full borders, centered, 8pt TNR)
  • Figures: \includegraphics{} + \caption{} → PNG/PDF embeds with italic captions
  • References: \thebibliography → formatted entries with hanging indent
  • Sections: \section{}, \subsection{} → bold headings
  • Metadata: \title, author, \abstract, \IEEEkeywords

Phase 3 — Merge

OMML equation paragraphs from pandoc are inserted into the cleanly-built document. Body paragraphs get 0.25in first-line indent. All LaTeX commands (\textbf, \toprule, \ref, \cite, \begin{itemize}, etc.) are stripped from text content.

Output Format

FeatureDetail
FontTimes New Roman (10pt body, 9pt table/figure, 8pt refs)
LayoutA4, two-column IEEE conference style
EquationsOMML (double-click to edit in Word)
TablesNative Word tables, all borders
FiguresPNG/PDF embedded with "Fig." captions
ReferencesHanging indent, [bN] format
First indent0.25in on body paragraphs

Verification

python scripts/verify.py output.docx

Reports paragraph/table/image/equation counts and checks for LaTeX residue.

Chinese (ctex) Support

Fully supports Chinese LaTeX documents using the ctex package:

  • Chinese section titles (引言, 方法, 实验, 结论等) are recognized
  • \section*{} (star variant) is supported
  • Chinese table headers preserved
  • Chinese text in titles rendered via w:eastAsia font fallback
  • \title{...} and \author{...} residue paragraphs are filtered

Limitations

  • Inline math ($...$) becomes plain text (italic), not OMML — only \begin{equation}, \begin{align}, and \[...\] become editable equations
  • No .bib support: references must be in \thebibliography{} environment
  • PNG images preferred: script tries PNG then PDF fallback
  • Pandoc path: the system pandoc binary must be discoverable by pypandoc

Script: scripts/tex2docx.py

Self-contained (660+ lines). Key internal functions:

FunctionRole
extract_tex()Parse all structural elements from .tex
extract_omml()Pull OMML XML from pandoc output
build_docx()Construct final document with all components
clean()Strip LaTeX commands to plain text
add_table()Build Word table with borders
add_figure()Embed image + caption

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

Research

Biomarker Investigation

Search the academic and patent literatures related to the biomarkers, based on the queries Load the skill when the queries are about - Refer a specific bioma...

Registry SourceRecently Updated
Research

孩子学习行为分析工具

Conducts video analysis of learning behavior for children/students, identifies poor learning habits, provides structured analysis reports and family educatio...

Registry SourceRecently Updated
Research

UUMuse Brain

Access, search, manage, and retrieve information from your UUMuse uploaded documents, knowledge bases, and long-term memory across sessions.

Registry SourceRecently Updated
Research

Autoresearch.Bak

Autonomous experiment loop for AI agents. Use when the user wants to run systematic experiments — optimizing hyperparameters, searching for better configurat...

Registry SourceRecently Updated