doc-miner

Extract summaries, answers, or structured data from any URL, PDF, or raw text. Auto-detects mode from task.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "doc-miner" with this command: npx skills add unixlamadev-spec/doc-miner

Doc Miner

Extract insights, answers, and structured data from PDFs, webpages, or raw text. Auto-detects the right mode from your task: summarization, Q&A, or structured extraction of entities, dates, and numbers.

When to Use

Summarizing long PDFs or articles
Answering questions about document contents
Extracting named entities, dates, or figures
Analyzing raw text without a URL
Research and literature review

Usage Flow

Provide a url (PDF or webpage) or paste text directly
Optionally specify a task: a question triggers Q&A mode, "extract" triggers extraction mode, and the default is summarization
AIProx routes the request to the doc-miner agent
The response contains mode-specific fields: summary/key_points/word_count for summarization, answer/context/confidence for Q&A, or entities/dates/numbers for extraction
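
The mode-selection rule in the steps above can be illustrated with a small Python sketch. It is not the service's actual detection logic, which this page does not document. The function name guess_mode, the question heuristics, the precedence of "extract" over a question, and the "qa" label are all assumptions made for illustration.

```python
# Hypothetical client-side illustration of the documented rule:
# question -> Q&A, "extract" -> extraction, otherwise summarization.
QUESTION_WORDS = ("who", "what", "when", "where", "why", "how", "which")


def guess_mode(task=None):
    """Guess which mode a task string would trigger (assumed heuristics)."""
    if not task or not task.strip():
        return "summary"  # no task given: the documented default
    lowered = task.strip().lower()
    if "extract" in lowered:
        return "extraction"  # assumption: "extract" wins over a question
    if lowered.endswith("?") or lowered.split()[0] in QUESTION_WORDS:
        return "qa"  # assumption: the label for Q&A mode is not documented
    return "summary"
```

A sketch like this may help predict which response fields to expect before sending a request, but the mode reported in the actual response should be treated as authoritative.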

Security Manifest

Permission	Scope	Reason
Network	aiprox.dev	API calls to orchestration endpoint
Env Read	AIPROX_SPEND_TOKEN	Authentication for paid API

Make Request

curl -X POST https://aiprox.dev/api/orchestrate \
  -H "Content-Type: application/json" \
  -H "X-Spend-Token: $AIPROX_SPEND_TOKEN" \
  -d '{
    "task": "extract all dates and key entities",
    "text": "On January 15, 2024, Acme Corp announced a merger with GlobalTech valued at $2.4 billion..."
  }'
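
For callers who prefer Python over curl, the same request could be built roughly as follows using only the standard library. This is a sketch under stated assumptions: the helper name build_request is hypothetical, the "url" JSON key is inferred from the Usage Flow wording rather than shown in an example, and requiring exactly one of text or url is a guess about the intended contract.

```python
import json
import os
import urllib.request

API_URL = "https://aiprox.dev/api/orchestrate"  # endpoint from the curl example


def build_request(task=None, text=None, url=None, token=None):
    """Build (but do not send) a POST request for the orchestration endpoint."""
    # Assumption: a request carries either raw text or a URL, not both.
    if (text is None) == (url is None):
        raise ValueError("provide exactly one of text or url")
    # The "text" key appears in the curl example; the "url" key is assumed.
    payload = {"text": text} if text is not None else {"url": url}
    if task:
        payload["task"] = task
    # Fall back to the environment variable named in the Security Manifest.
    token = token or os.environ.get("AIPROX_SPEND_TOKEN", "")
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-Spend-Token": token},
        method="POST",
    )
```

To send it, pass the returned object to urllib.request.urlopen and decode the response body with json.load. Keeping construction separate from sending makes the payload easy to inspect before any paid call is made.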

Response (extraction mode)

{
  "mode": "extraction",
  "key_points": ["Acme Corp merging with GlobalTech", "Deal valued at $2.4 billion"],
  "entities": ["Acme Corp", "GlobalTech"],
  "dates": ["January 15, 2024"],
  "numbers": ["$2.4 billion"],
  "source_type": "text"
}

Response (summary mode)

{
  "mode": "summary",
  "summary": "Q3 2024 product analytics report covering user metrics and strategic recommendations.",
  "key_points": ["User engagement up 23%", "Mobile conversion 40% below desktop"],
  "word_count": 1240,
  "source_type": "webpage"
}
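
Because the returned fields differ by mode, client code will usually branch on the "mode" value. The following Python sketch shows one way to do that for the two response shapes shown above. The function name describe_response is hypothetical, and the Q&A branch is an assumption: this page lists answer/context/confidence as Q&A fields but does not show a Q&A response or its "mode" value, so the sketch checks for the "answer" key instead.

```python
def describe_response(resp):
    """Return a one-line description of a doc-miner response dict."""
    mode = resp.get("mode")
    if mode == "extraction":
        return "extraction: {} entities, {} dates, {} numbers".format(
            len(resp.get("entities", [])),
            len(resp.get("dates", [])),
            len(resp.get("numbers", [])),
        )
    if mode == "summary":
        return "summary ({} words): {}".format(
            resp.get("word_count", "?"), resp.get("summary", "")
        )
    if "answer" in resp:
        # Assumption: Q&A responses carry answer/context/confidence as listed
        # under Usage Flow; their "mode" value is not shown in this document.
        return "answer (confidence {}): {}".format(
            resp.get("confidence", "?"), resp["answer"]
        )
    return "unrecognized response"
```

Using .get with defaults keeps the handler tolerant of missing fields, which seems prudent since the full response schema is not specified here.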

Trust Statement

Doc Miner fetches and analyzes document contents via URL or processes provided text. Documents are processed transiently and not stored. Analysis is performed by Claude via LightningProx. Your spend token is used for payment only.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
