doc-miner

Extract summaries, answers, or structured data from any URL, PDF, or raw text. Auto-detects mode from task.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "doc-miner" with this command: npx skills add unixlamadev-spec/doc-miner

Doc Miner

Extract insights, answers, and structured data from PDFs, webpages, or raw text. Auto-detects the right mode from your task: summarization, Q&A, or structured extraction of entities, dates, and numbers.

When to Use

  • Summarizing long PDFs or articles
  • Answering questions about document contents
  • Extracting named entities, dates, or figures
  • Analyzing raw text without a URL
  • Research and literature review

Usage Flow

  1. Provide a url (PDF or webpage), or paste the text directly
  2. Optionally specify a task: a question triggers Q&A mode, "extract" triggers extraction mode, and the default is summarization
  3. AIProx routes the request to the doc-miner agent
  4. The agent returns mode-specific fields: summary, key_points, and word_count for summarization; answer, context, and confidence for Q&A; entities, dates, and numbers for extraction
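
The task-to-mode rule in step 2 can be approximated on the client side, for example to predict which response fields to expect. This is a minimal sketch, not the service's actual detection logic: the helper name guess_mode, the "qa" label, the question-word list, and the choice to check "extract" before checking for a question are all assumptions. Only "summary" and "extraction" appear as mode values in the example responses.

```python
from typing import Optional

# Hypothetical word list: the listing only says that "asking a question"
# triggers Q&A mode, not how a question is recognized.
QUESTION_WORDS = ("who", "what", "when", "where", "why", "how", "which")


def guess_mode(task: Optional[str]) -> str:
    """Predict the mode a task string is likely to trigger (client-side guess)."""
    if not task or not task.strip():
        return "summary"  # documented default when no task is given
    text = task.strip().lower()
    if "extract" in text:
        return "extraction"  # documented trigger word
    if text.endswith("?") or text.split()[0] in QUESTION_WORDS:
        return "qa"  # assumed label; the real mode value is not shown
    return "summary"
```

The authoritative mode is whatever the response reports in its mode field, so a guess like this is best treated as a hint only.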

Security Manifest

Permission  Scope               Reason
Network     aiprox.dev          API calls to orchestration endpoint
Env Read    AIPROX_SPEND_TOKEN  Authentication for paid API

Make Request

curl -X POST https://aiprox.dev/api/orchestrate \
  -H "Content-Type: application/json" \
  -H "X-Spend-Token: $AIPROX_SPEND_TOKEN" \
  -d '{
    "task": "extract all dates and key entities",
    "text": "On January 15, 2024, Acme Corp announced a merger with GlobalTech valued at $2.4 billion..."
  }'
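
The same request can be sketched in Python with only the standard library. The endpoint, headers, and the task and text fields come from the curl command above. The url field name is inferred from the usage flow, and the rule that exactly one of text or url is sent is an assumption. build_request and send are hypothetical helper names.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

ENDPOINT = "https://aiprox.dev/api/orchestrate"


def build_request(
    token: str,
    task: Optional[str] = None,
    text: Optional[str] = None,
    url: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the curl example."""
    # Assumption: a request carries either raw text or a url, not both.
    if (text is None) == (url is None):
        raise ValueError("provide exactly one of text or url")
    payload: Dict[str, Any] = {"text": text} if text is not None else {"url": url}
    if task:
        payload["task"] = task  # optional; summarization is the default mode
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Spend-Token": token,
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> Dict[str, Any]:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A call would then look like send(build_request(os.environ["AIPROX_SPEND_TOKEN"], task="extract all dates and key entities", text="...")), after importing os, with the token read from the environment as the Security Manifest describes.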

Response (extraction mode)

{
  "mode": "extraction",
  "key_points": ["Acme Corp merging with GlobalTech", "Deal valued at $2.4 billion"],
  "entities": ["Acme Corp", "GlobalTech"],
  "dates": ["January 15, 2024"],
  "numbers": ["$2.4 billion"],
  "source_type": "text"
}

Response (summary mode)

{
  "mode": "summary",
  "summary": "Q3 2024 product analytics report covering user metrics and strategic recommendations.",
  "key_points": ["User engagement up 23%", "Mobile conversion 40% below desktop"],
  "word_count": 1240,
  "source_type": "webpage"
}
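
Because the returned fields depend on the mode, a client will usually branch on the mode field. The following is a rough sketch that pulls out the mode-specific fields listed in the usage flow. The helper name mode_fields is hypothetical. Since no Q&A response is shown, that case is detected by the presence of an answer field rather than by a mode value, which is an assumption.

```python
from typing import Any, Dict

# Field names per mode, taken from the usage flow and the example responses.
MODE_FIELDS = {
    "summary": ("summary", "key_points", "word_count"),
    "extraction": ("entities", "dates", "numbers"),
}
# The mode value for Q&A is not shown, so it is matched by field instead.
QA_FIELDS = ("answer", "context", "confidence")


def mode_fields(response: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the mode-specific fields of a doc-miner response."""
    mode = response.get("mode")
    names = MODE_FIELDS.get(mode)
    if names is None:
        if "answer" in response:
            names = QA_FIELDS  # assumed Q&A response shape
        else:
            raise ValueError(f"unrecognized response mode: {mode!r}")
    # Missing fields come back as None instead of raising KeyError.
    return {name: response.get(name) for name in names}
```

Shared fields such as mode and source_type are left out on purpose, so the result holds only what differs between modes.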

Trust Statement

Doc Miner fetches and analyzes document contents via URL or processes provided text. Documents are processed transiently and not stored. Analysis is performed by Claude via LightningProx. Your spend token is used for payment only.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
