camscanner-pdf2office

Use CamScanner to convert PDF documents to editable Word (.docx) or Excel (.xlsx) format, with intelligent content recognition and accurate format preservation. Triggers on "PDF to Word", "PDF to Excel", "convert PDF to docx", "convert PDF to xlsx", or when the user has a PDF and needs it as an editable Office document.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "camscanner-pdf2office" with this command: npx skills add CamScanner/camscanner-pdf2office-office

CamScanner PDF to Office

Overview

CamScanner provides document conversion capabilities that convert PDF documents to Word or Excel documents while preserving original formatting. The workflow is a 3-step pipeline: upload the PDF, convert it, then download the result.

When to Use

  • User wants to convert a PDF to Word (.docx) or Excel (.xlsx)
  • User wants to make a PDF editable
  • User has a PDF and needs it as an Office document

Privacy & Data

Important: Privacy & Data Flow Notice

  • Third-party service: This skill sends your files to CamScanner's official servers (ai-tools.camscanner.com) for processing.
  • Data retention: CamScanner servers process your files in real-time. Files are not permanently stored on the server.
  • Local files: Output files are saved to your local filesystem at the path you specify.

API Reference

Base URL: https://ai-tools.camscanner.com

Supported Conversions

source_typetarget_typeOutput
pdfword.docx
pdfexcel.xlsx

Step 1: Upload PDF

BASE="https://ai-tools.camscanner.com"

IN_FILE_ID=$(curl -sS -X POST "$BASE/v1/tools/upload_file/execute" \
  -H "Content-Type: application/octet-stream" \
  --data-binary "@/path/to/document.pdf" | jq -r '.tool_result.data.file_id')

Response:

{
  "code": 200,
  "tool": "upload_file",
  "tool_result": {
    "success": true,
    "data": {
      "file_id": "file_1741857600_ab12cd34ef56",
      "size": 24576
    }
  }
}

Step 2: Convert PDF

OUT_FILE_ID=$(curl -sS -X POST "$BASE/v1/tools/convert_pdf/execute" \
  -H "Content-Type: application/json" \
  -d "{\"file_id\":\"$IN_FILE_ID\",\"source_type\":\"pdf\",\"target_type\":\"TARGET\",\"output_mode\":\"file_id\"}" \
  | jq -r '.tool_result.data.file_id')

Replace TARGET with one of: word, excel.

Response:

{
  "code": 200,
  "tool": "convert_pdf",
  "tool_result": {
    "success": true,
    "data": {
      "file_id": "file_1741857722_ddeeff001122",
      "target_type": "word"
    }
  }
}

Step 3: Download Result

curl -sS -X POST "$BASE/v1/tools/download_file/execute?response_mode=raw" \
  -H "Content-Type: application/json" \
  -d "{\"file_id\":\"$OUT_FILE_ID\"}" \
  -o /path/to/output.docx

Critical: The response_mode=raw query parameter is required to get the binary file. Without it, the response is JSON.

Quick Reference: Complete Pipeline

BASE="https://ai-tools.camscanner.com"
INPUT_PDF="/path/to/document.pdf"
TARGET_TYPE="word"          # word | excel
OUTPUT_FILE="/path/to/output.docx"

# Upload
IN_FILE_ID=$(curl -sS -X POST "$BASE/v1/tools/upload_file/execute" \
  -H "Content-Type: application/octet-stream" \
  --data-binary "@$INPUT_PDF" | jq -r '.tool_result.data.file_id')

# Convert
OUT_FILE_ID=$(curl -sS -X POST "$BASE/v1/tools/convert_pdf/execute" \
  -H "Content-Type: application/json" \
  -d "{\"file_id\":\"$IN_FILE_ID\",\"source_type\":\"pdf\",\"target_type\":\"$TARGET_TYPE\",\"output_mode\":\"file_id\"}" \
  | jq -r '.tool_result.data.file_id')

# Download
curl -sS -X POST "$BASE/v1/tools/download_file/execute?response_mode=raw" \
  -H "Content-Type: application/json" \
  -d "{\"file_id\":\"$OUT_FILE_ID\"}" \
  -o "$OUTPUT_FILE"

File Extension Mapping

target_typeExtension
word.docx
excel.xlsx

Common Mistakes

MistakeFix
Forgetting response_mode=raw on downloadAlways append ?response_mode=raw to the download URL
Wrong Content-Type on uploadUpload uses application/octet-stream, not multipart/form-data
Using GET instead of POSTAll three endpoints use POST
Missing source_type in convert requestAlways include "source_type": "pdf"
Missing output_mode in convert requestAlways include "output_mode": "file_id" to get a downloadable file_id
Wrong output extensionMatch extension to target_type (see table above)

Error Handling

Check each step before proceeding:

# After upload
if [ -z "$IN_FILE_ID" ] || [ "$IN_FILE_ID" = "null" ]; then
  echo "Upload failed"; exit 1
fi

# After convert
if [ -z "$OUT_FILE_ID" ] || [ "$OUT_FILE_ID" = "null" ]; then
  echo "Conversion failed"; exit 1
fi

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

通义晓蜜 - 智能外呼

触发阿里云晓蜜外呼机器人任务,自动批量拨打电话。适用于批量外呼、客户回访、满意度调查、简历筛查约面试等场景。可从前置工具或节点获取外呼名单。

Registry SourceRecently Updated
General

Letterboxd Watchlist

Scrape a public Letterboxd user's watchlist into a CSV/JSONL list of titles and film URLs without logging in. Use when a user asks to export, scrape, or mirror a Letterboxd watchlist, or to build watch-next queues.

Registry SourceRecently Updated
General

Seedance Video Generation

Generate AI videos using ByteDance Seedance. Use when the user wants to: (1) generate videos from text prompts, (2) generate videos from images (first frame, first+last frame, reference images), or (3) query/manage video generation tasks. Supports Seedance 1.5 Pro (with audio), 1.0 Pro, 1.0 Pro Fast, and 1.0 Lite models.

Registry SourceRecently Updated
4.2K17jackycser
General

Universal Skills Manager

The master coordinator for AI skills. Discovers skills from multiple sources (SkillsMP.com, SkillHub, and ClawHub), manages installation, and synchronization...

Registry SourceRecently Updated