wiseocr

PDF & Image OCR — Convert a single PDF or image to Markdown via WiseDiag cloud API, with high-accuracy text extraction, table recognition, and multi-column layout support. Usage: Upload a file and say Use WiseOCR to OCR this.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "wiseocr" with this command: npx skills add wisediag/wiseocr

⚠️ Privacy Warning

IMPORTANT - READ BEFORE INSTALLING:

This skill uploads your file to WiseDiag's cloud servers for OCR processing.

Do NOT use with sensitive or confidential documents unless:

  • You trust WiseDiag's data handling policies
  • You accept that file contents will be transmitted and processed remotely

🚫 Personal Sensitive Information Warning

Do NOT upload documents containing any of the following:

CategoryExamples
IdentityID card numbers, passport numbers, driver's license numbers
FinancialBank account numbers, credit card numbers, tax IDs
BiometricFingerprints, facial recognition data, voiceprints
CredentialsPasswords, PINs, security question answers
Personal contactHome address, personal phone number, personal email
Minor's infoAny information belonging to children under 18

If the document contains sensitive personal information, it is recommended to use a local/offline OCR solution.


WiseOCR Skill (powered by WiseDiag)

A high-accuracy OCR tool that converts a single PDF or image file into Markdown format. After processing, the Markdown result is automatically saved to disk — no additional saving is needed.

Supported formats: PDF, jpg, jpeg, png, webp, gif, bmp, tiff.

Installation

pip install -r requirements.txt

🔑 API Key Setup (Required)

Get your API key: 👉 https://console.wisediag.com/apiKeyManage

The API key MUST be set as an environment variable. The script reads it automatically.

export WISEDIAG_API_KEY=your_api_key

How to Process a File (Step-by-Step)

NEVER call any API or HTTP endpoint directly. ONLY use the script below.

Step 1: Set the API key (if not already set):

export WISEDIAG_API_KEY=your_api_key

Step 2: Run the script with the input file:

cd scripts

# Single PDF
python3 wiseocr.py -i "/path/to/input_filename.pdf"

# Single image
python3 wiseocr.py -i "/path/to/scan.png"

IMPORTANT: If the input file has been copied or renamed (e.g. to a temp path), always pass -n with the original filename (without extension) so the output file is named correctly:

python3 wiseocr.py -i "/tmp/ocr_input.pdf" -n "my_report"
# Output saved to: ~/.openclaw/workspace/WiseOCR/my_report.md

The Markdown result is saved to ~/.openclaw/workspace/WiseOCR/{name}.md automatically. No additional saving is needed.

Arguments

FlagDescription
-i, --inputInput file: PDF or image — single file path (required)
-n, --nameOutput filename stem (recommended when input file is renamed/copied)
-o, --outputOutput directory (default: ~/.openclaw/workspace/WiseOCR)
--dpiPDF rendering DPI, 72-600 (default: 200)

Data Privacy

What happens to your files:

  1. Files are uploaded to WiseDiag's OCR API
  2. Files are processed on WiseDiag servers
  3. Processing results are returned to you
  4. Files are not permanently stored on WiseDiag servers

⚠️ Do NOT upload documents containing personal sensitive information, including but not limited to:

  • Identity information: ID card numbers, passport numbers, driver's license numbers
  • Financial information: Bank account/card numbers, credit card numbers, tax IDs, financial statements
  • Biometric data: Fingerprints, facial images, iris scans, voiceprints
  • Account credentials: Passwords, PINs, security question answers, private keys
  • Personal contact details: Home address, personal phone number, personal email
  • Information about minors: Any personally identifiable information belonging to children under 18

For documents containing the above, use offline/local OCR tools instead.

License

MIT

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

CamScanner Remove PDF Watermark

Use CamScanner to remove watermarks from PDF documents while preserving the underlying text, images, and original layout. Powered by a high-precision documen...

Registry SourceRecently Updated
General

CamScanner Remove Image Watermark

Use CamScanner to remove watermarks from images while preserving the underlying content and original layout. Powered by a high-precision image enhancement en...

Registry SourceRecently Updated
General

TikTok Post Search API

Call GET /api/tiktok/search-post/v1 for TikTok Post Search through JustOneAPI with keyword.

Registry SourceRecently Updated
General

Skill Linkedin

Use when user needs to generate personalized LinkedIn connection requests or outreach messages. Use when assisting LinkedIn lead generation and engagement. U...

Registry SourceRecently Updated