paper-cluster-survey-v2-2

Extract structured paper records from one or more local PDFs, arXiv links, DOI links, or general paper URLs, then classify the papers and write an academic survey review. Use when the task involves mixed paper sources, URL-first literature collection, PDF-based review drafting, taxonomy building, or producing a formal literature review from a paper set. By default, provide one classification table and one integrated review for the full corpus; only write separate reviews for each category when the user explicitly asks for per-category reviews.

Install this skill with: npx skills add huang888596/paper-cluster-survey-v2-2

Paper Cluster Survey V2.2

Overview

Turn raw paper URLs and PDFs into usable review inputs. Extract structured metadata and text evidence first, then classify the papers, produce a classification table, and write a review that follows common academic survey conventions instead of a rigid fill-in-the-blanks template.

Workflow

1. Normalize the source set

  • Accept multiple local PDF paths, arXiv URLs, DOI URLs, and general paper URLs.
  • Use scripts/normalize-sources.mjs when the source set is mixed or should be stored as a reusable manifest; a hypothetical manifest entry is sketched after this list.
  • Preserve the original source string for traceability.
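
For illustration, a manifest entry might look like the sketch below. The field names (source, kind, resolved) are assumptions made for this example, not documented output of the script:

    [
      {
        "source": "https://arxiv.org/abs/XXXX.XXXXX",
        "kind": "arxiv",
        "resolved": "https://arxiv.org/pdf/XXXX.XXXXX"
      },
      {
        "source": "papers/example-paper.pdf",
        "kind": "pdf",
        "resolved": "papers/example-paper.pdf"
      }
    ]

Whatever the actual format, the verbatim source string is the one field worth guaranteeing, so every downstream claim can be traced back to its origin.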

2. Extract paper records before reasoning

  • Use scripts/extract-paper-records.mjs to turn PDFs and URLs into structured records before classification.
  • The extraction pass should gather as much of the following as possible (an illustrative record is sketched after this list):
    • title
    • authors
    • year
    • venue
    • abstract
    • task
    • method
    • datasets
    • metrics
    • main_contribution
    • limitations
    • source
    • extraction_notes
  • Treat extracted records as the primary context for classification and survey drafting.
  • If important fields are missing, only fall back to direct source reading for the specific missing details.
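
An extracted record, using the field list above, might look like this; every value is an illustrative placeholder, not real paper data:

    {
      "title": "<paper title>",
      "authors": ["<author 1>", "<author 2>"],
      "year": 2023,
      "venue": "<conference or journal>",
      "abstract": "<abstract text>",
      "task": "<task the paper addresses>",
      "method": "<method family or approach>",
      "datasets": ["<dataset name>"],
      "metrics": ["<metric name>"],
      "main_contribution": "<one-sentence contribution>",
      "limitations": "<stated or inferred limitations>",
      "source": "https://arxiv.org/abs/XXXX.XXXXX",
      "extraction_notes": "<e.g. abstract taken from landing page; body text unavailable>"
    }

A record like this, not the raw PDF, is what classification and drafting should consume by default.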

Read extraction-pipeline.md when deciding how much to trust the extracted fields and when to re-open the raw source.

3. Verify evidence quality

  • Do not classify from titles alone when abstract or body text is available.
  • Prefer the abstract, introduction, and method sections.
  • Mark uncertain inferences explicitly.
  • If the extractor had to fall back to weak methods, keep claims conservative.

4. Design the classification scheme

  • Produce a classification scheme before writing the review; a minimal sketch of one follows this list.
  • Prefer task-based categories first.
  • If tasks are too similar, classify by method family.
  • Use application-domain categories only when they best explain the corpus.
  • Keep the taxonomy shallow unless the corpus is large.
  • Assign one primary category to each paper unless the user explicitly wants multi-label grouping.
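
Under these rules, a minimal task-based scheme could be captured as a small structure like the one below. The layout and field names are an illustrative assumption, since the skill does not prescribe how the scheme is stored:

    {
      "basis": "task",
      "categories": [
        { "name": "<task A>", "definition": "<what qualifies a paper for this category>" },
        { "name": "<task B>", "definition": "<what qualifies a paper for this category>" }
      ]
    }

Writing down the qualifying definition per category makes the later per-paper rationales easier to keep consistent.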

Read taxonomy-guidelines.md when the category design is ambiguous.

5. Output the classification table

  • Always provide one classification table before the review; a layout sketch follows this list.
  • Include:
    • paper
    • year
    • category
    • rationale
    • evidence used
  • Keep rationales brief and evidence-based.
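
As a layout illustration only, with placeholders rather than real entries, the table can be as simple as:

    paper          | year | category | rationale                       | evidence used
    ---------------|------|----------|---------------------------------|--------------------------
    <paper title>  | 2023 | <task A> | <one line tied to the abstract> | abstract, method section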

6. Decide the review shape

Default rule:

  • Write one integrated literature review for the entire corpus after the classification table.

Exception:

  • If the user explicitly asks for "each category write a survey", "分别写综述" (Chinese for "write separate reviews"), "per-category review", or equivalent, write separate review sections for each category.

7. Write the review in academic survey style

The review must read like a normal survey paper, not a bullet summary dump.

  • Use a concise academic title.
  • Include an abstract when the output is formal enough to justify it.
  • Include keywords when they help position the review.
  • Use an introduction that explains background, significance, scope, source selection, and the organizing logic of the review.
  • Organize the main body by the most defensible basis for the corpus:
    • chronology
    • research themes
    • method families
    • viewpoints or schools
  • End with a conclusion or concluding discussion.
  • Add future directions, outlook, or open problems when the corpus supports them.
  • List references in GB/T 7714 style when bibliographic data is available.
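
For example, a hypothetical journal-article entry in GB/T 7714 style, with placeholder bibliographic details:

    [1] SMITH J, DOE A. A survey of <topic>[J]. Journal of Examples, 2023, 15(4): 101-110.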

Typical sections in a strong review are:

  • title
  • abstract
  • keywords
  • introduction
  • themed main sections
  • discussion, conclusion, or both
  • future directions or open problems when useful
  • references

Not every output needs every section. Match the structure to the user's request, the corpus size, and the field while staying recognizably review-like.

Read review-paper-style.md when drafting the prose review or choosing section structure.

8. Keep the prose review-like

  • Prefer connected academic prose over bullet dumps.
  • Use tables only to support comparison, not replace the review.
  • Do not invent datasets, metrics, or reference details.
  • If extracted metadata is incomplete, keep partial references and state what is missing.

Output Contract

Return results in this order unless the user asks otherwise:

  1. Corpus summary
  2. Classification scheme
  3. Classification table
  4. Formal review article
  5. References

If the user wants structured output, read output-schema.md.

Bundled Scripts

scripts/normalize-sources.mjs

  • Normalize mixed PDF and URL inputs into a JSON manifest.
  • Use when the source set is large, mixed, or should be reused.

scripts/extract-paper-records.mjs

  • Fetch URLs, resolve likely paper metadata, and extract paper text evidence from URLs or PDFs.
  • Prefer this script before asking the model to reason over a large source set.
  • Use its output as the main context object for classification and review drafting.

scripts/render-formal-review-template.mjs

  • Render a flexible academic-review scaffold from structured paper records.
  • Default to one integrated review.
  • Use --per-category only when the user explicitly asks for separate category reviews.
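
A possible end-to-end invocation is sketched below. Only the --per-category flag is documented above; the positional arguments and output redirection are assumptions about how the scripts might be wired together, so check each script's usage before relying on them:

    node scripts/normalize-sources.mjs paper.pdf https://arxiv.org/abs/XXXX.XXXXX > manifest.json
    node scripts/extract-paper-records.mjs manifest.json > records.json
    node scripts/render-formal-review-template.mjs records.json > review-draft.md
    # add --per-category only when the user explicitly asks for separate category reviews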

Quality Bar

  • Run extraction before classification unless the user already gave structured paper records.
  • Keep classification and review consistent with extracted evidence.
  • Use raw source re-reading only to fill important gaps.
  • If the extractor had to rely on weak fallbacks, say so.
