research-paper-figure-skill-factory

Use when the user wants a research-paper figure Skill Factory: build, patch, package, or use reusable specialized paper-figure-making skills from lawful literature/corpus evidence. Generated skills must follow a specialized-skill-first workflow; full-feasible local PDF coverage where available; startup-plan-only first replies; strict text/image turn separation; ChatGPT web Create image / ChatGPT Images 2.0 rendering; Codex $imagegen-first rendering; sample-image transfer rules; all-step/current-position state footers; and a mandatory text-candidates -> visual-candidate-board -> image-only candidate generation -> selection workflow after every multi-option figure decision.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy the command below and send it to your AI assistant to install this skill:

Install skill "research-paper-figure-skill-factory" with this command: npx skills add OpenAI/research-paper-figure-skill-factory

Research Paper Figure Skill Factory

This skill is a two-layer research-paper figure Skill Factory.

  1. Skill Builder layer: build or patch a reusable specialized figure-making skill for one paper-figure class by acquiring lawful source material, extracting figure evidence, building a taxonomy, generating the skill package, testing it, and locking it.
  2. Figure Production layer: after a specialized skill is locked, use that generated skill to design, compare, render, review, and integrate concrete figures for arbitrary target papers of the same figure class.

Non-Negotiable Contract

First Trigger

On first trigger, output only a startup plan. Do not analyze a paper, build a taxonomy, create candidate schemes, draft prompts, or generate images. The first reply is STARTUP_PLAN_ONLY (TEXT_ONLY).

If the first user message asks for images, record the request as pending only. The first reply must not call Create image, $imagegen, an image API, or include image artifacts.
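
The first-trigger rule above can be sketched as a tiny state record. This is an illustrative sketch only: the names `FirstTurn` and `plan_first_reply`, and the keyword heuristic, are assumptions, not part of the skill's actual interface.

```python
from dataclasses import dataclass

@dataclass
class FirstTurn:
    mode: str = "STARTUP_PLAN_ONLY"      # first reply is always a text-only startup plan
    pending_image_request: bool = False  # image asks are recorded, never acted on yet

def plan_first_reply(user_message: str) -> FirstTurn:
    """First trigger: never render; only note that an image was requested."""
    turn = FirstTurn()
    # Hypothetical keyword check standing in for real intent detection.
    if any(k in user_message.lower() for k in ("image", "figure", "render")):
        turn.pending_image_request = True
    return turn
```

The point of the sketch is the invariant: `mode` is constant regardless of what the first message asks for.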

Specialized-Skill-First Builder Rule

The normal route is:

figure-class goal -> corpus plan -> lawful acquisition/local corpus -> evidence extraction -> taxonomy -> specialized skill blueprint -> generated specialized skill -> tests/patches -> locked skill -> target-paper production.

Do not jump from source papers directly to one concrete figure unless the user explicitly chooses a full production fast-track. If fast-tracking, record the skipped builder steps and fallback skill/taxonomy.
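
Under the assumption that a fast-track must log every skipped builder stage, the route and its skip record could look like this sketch (the record shape is illustrative):

```python
# Step names mirror the prose route above.
BUILDER_ROUTE = [
    "figure-class goal", "corpus plan", "lawful acquisition/local corpus",
    "evidence extraction", "taxonomy", "specialized skill blueprint",
    "generated specialized skill", "tests/patches", "locked skill",
    "target-paper production",
]

def fast_track(chosen_step: str) -> dict:
    """User explicitly fast-tracks to a step: record all skipped builder steps."""
    i = BUILDER_ROUTE.index(chosen_step)
    return {
        "fast_track": True,
        "skipped_steps": BUILDER_ROUTE[:i],
        "fallback": "default taxonomy/skill noted for skipped steps",
    }
```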

Full-Feasible Corpus Rule

When local PDFs, a paper index, or retrieval manifests exist, enumerate the full relevant candidate set and process as many accessible relevant PDFs as feasible. A small sample can support only a limited/pilot/fallback lock unless the user explicitly accepts that limitation. Representative rendered pages are audit aids only, not the corpus size.
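
One way to operationalize this rule is a coverage check over the manifest. The 50% threshold below is an illustrative assumption; the spec says only "as many accessible relevant PDFs as feasible":

```python
def lock_level(n_accessible: int, n_processed: int) -> str:
    """Decide what kind of lock the processed corpus can support."""
    coverage = n_processed / max(n_accessible, 1)
    # Illustrative cutoff for "full-feasible" coverage; not from the spec.
    if coverage >= 0.5:
        return "full-corpus lock"
    return "limited/pilot/fallback lock"
```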

Mandatory Candidate-Image Bridge

Every generated specialized figure-making skill must include a hard workflow bridge after any multi-option text decision:

  1. TEXT_ONLY candidate text turn: present 4-6 text candidates, normally 6.
  2. TEXT_ONLY visual candidate setup turn: define candidate count, varied axis, fixed elements, rendering route, and what the user should compare.
  3. IMAGE_ONLY candidate-board turn: generate/display 4-6 candidate images or schematic candidates, normally 6.
  4. TEXT_ONLY candidate-review turn: record the previous image batch, compare candidates, recommend one direction, and ask the user to select, revise, or request another board.

This bridge is mandatory after candidate schemes, subtype choices, layout choices, style choices, metaphor choices, density choices, and prompt alternatives. The generated skill must not move directly from 4-6 text candidates to final prompt construction, final image generation, caption writing, or text-only locking unless the user explicitly says to skip image candidates and stay text-only. If skipped, record visual_candidate_board_skipped_by_user: true.
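
The four-turn bridge is a fixed sequence, which a generated skill could enforce with a minimal turn table. Modes come straight from the prose; the function shape is an assumption:

```python
BRIDGE = [
    ("candidate_text", "TEXT_ONLY"),
    ("visual_candidate_setup", "TEXT_ONLY"),
    ("candidate_board", "IMAGE_ONLY"),
    ("candidate_review", "TEXT_ONLY"),
]

def next_bridge_turn(done: int, user_skipped: bool = False):
    """Return the next required (step, mode), or None when the bridge is complete.
    An explicit user skip exits the bridge but must be recorded in the footer."""
    if user_skipped:
        return ("record", {"visual_candidate_board_skipped_by_user": True})
    return BRIDGE[done] if done < len(BRIDGE) else None
```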

Generated skill lock/test must fail if:

  • the workflow lacks a dedicated visual candidate setup step;
  • the workflow lacks a dedicated IMAGE_ONLY candidate-board step before direction lock;
  • examples show text candidates followed directly by final prompt or final image generation;
  • the state footer cannot record visual_candidate_board_status, candidate_image_batch_id, and selected_visual_candidate;
  • multi-option next prompts do not ask the user to generate/display multiple candidate images or schematic candidates, normally 6.
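
A lock/test gate for these failure conditions might look like the following sketch. The `skill` dict keys are assumptions about how a generated package would be introspected:

```python
REQUIRED_FOOTER_FIELDS = {
    "visual_candidate_board_status",
    "candidate_image_batch_id",
    "selected_visual_candidate",
}

def lock_failures(skill: dict) -> list[str]:
    """Return the list of lock-blocking failures; empty means the lock may pass."""
    failures = []
    steps = skill.get("workflow_steps", [])
    if "visual_candidate_setup" not in steps:
        failures.append("missing visual candidate setup step")
    if "image_only_candidate_board" not in steps:
        failures.append("missing IMAGE_ONLY candidate-board step")
    if skill.get("examples_jump_text_to_final"):
        failures.append("examples jump from text candidates to final generation")
    if not REQUIRED_FOOTER_FIELDS <= set(skill.get("footer_fields", [])):
        failures.append("state footer cannot record bridge fields")
    return failures
```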

Strict Text/Image Separation

Every response is exactly one modality:

  • TEXT_ONLY: planning, intake, diagnosis, candidate text, candidate-board setup, prompt writing, critique, status, and next prompts.
  • IMAGE_ONLY: image generation only. No prose, captions, critique, prompt text, or state footer.

If a reply emits any visible text, it must not generate images in that same response. Once the user confirms generation and the state is sufficient, the next assistant response may then be IMAGE_ONLY.
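
The one-modality-per-response rule reduces to a simple check. This is a sketch; the boolean inputs stand in for however a runtime detects text and image content in a reply:

```python
def reply_modality(has_text: bool, has_images: bool):
    """Classify a reply, or return None if it illegally mixes modalities."""
    if has_text and has_images:
        return None                      # mixed text+image replies are forbidden
    return "IMAGE_ONLY" if has_images else "TEXT_ONLY"
```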

Rendering Route

For candidate boards, draft candidates, final diagrams, and revisions:

  1. ChatGPT web must use Create image through ChatGPT Images 2.0.
  2. Codex must use the $imagegen skill first.
  3. If $imagegen is unavailable in Codex, use ChatGPT Images 2.0 API or another approved image-generation API.
  4. Native bitmap outputs such as PNG, JPG, JPEG, and WebP are allowed when produced by the approved image route.
  5. Do not use SVG, Mermaid, TikZ, Graphviz, HTML/CSS, canvas, matplotlib, filesystem code drawing, or code-rendered/exported figures as candidate images, draft images, final visuals, or fallbacks.
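
The route priority above can be sketched as a selector. The environment labels (`"chatgpt_web"`, `"codex"`) and the availability flag are assumptions about how a runtime would expose this information:

```python
APPROVED_BITMAP = {"png", "jpg", "jpeg", "webp"}
FORBIDDEN_RENDERERS = {"svg", "mermaid", "tikz", "graphviz", "html/css",
                       "canvas", "matplotlib"}

def pick_route(env: str, imagegen_available: bool = True) -> str:
    """Select the approved rendering route for the current environment."""
    if env == "chatgpt_web":
        return "Create image (ChatGPT Images 2.0)"
    if env == "codex":
        return "$imagegen" if imagegen_available else "ChatGPT Images 2.0 API"
    raise ValueError(f"no approved image route for environment: {env}")

def allowed_output(fmt: str) -> bool:
    """Only native bitmap formats from the approved route are allowed."""
    return fmt.lower() in APPROVED_BITMAP
```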

Reference Images

Generated specialized skills must support optional sample/reference images. If the user provides multiple images, ask which attributes to borrow from each image: style, layout, panel rhythm, density, content-detail level, labels, color semantics, callout grammar, or negative-reference constraints.
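
Per-image borrowing could be recorded as a mapping from reference image to the attributes borrowed from it. The attribute vocabulary comes from the prose; the function and data shape are illustrative:

```python
BORROWABLE = {
    "style", "layout", "panel rhythm", "density", "content-detail level",
    "labels", "color semantics", "callout grammar", "negative-reference",
}

def record_borrowing(images: dict[str, list[str]]) -> dict[str, list[str]]:
    """Keep only recognized borrowable attributes for each reference image."""
    return {img: [a for a in attrs if a in BORROWABLE]
            for img, attrs in images.items()}
```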

Every Text Reply

Every TEXT_ONLY reply from this factory and from generated specialized skills must include:

  • 当前执行计划 (current execution plan)
  • substantive work for the current step
  • 默认推荐 (default recommendation)
  • 当前状态与产物 (current state and artifacts)
  • 下一步你可以这样问 (suggested next prompts)

The state footer must list all steps plus the current position and the response mode of every step. The first copyable next prompt must use:

请使用**<当前skill名称>**,执行,根据当前状态,下一步执行:...
(English: "Use **<current skill name>**, execute; based on the current state, next execute: ...")

Always include:

请使用**<当前skill名称>**,根据当前状态,提供下一步提问建议。
(English: "Use **<current skill name>**; based on the current state, suggest the next questions to ask.")
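
A minimal sketch of the required TEXT_ONLY reply skeleton follows. The section names are the literal strings mandated above; the renderer itself is illustrative:

```python
# Required sections of every TEXT_ONLY reply, in order.
SECTIONS = [
    "当前执行计划",            # current execution plan
    "<substantive work for the current step>",
    "默认推荐",                # default recommendation
    "当前状态与产物",          # current state and artifacts (full state footer)
    "下一步你可以这样问",      # suggested next prompts
]

def first_next_prompt(skill_name: str) -> str:
    """Render the mandated first copyable next prompt for a given skill."""
    return f"请使用**{skill_name}**,执行,根据当前状态,下一步执行:..."
```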

Skill Builder Workflow

| Step | Layer | Mode | Purpose | Output |
| --- | --- | --- | --- | --- |
| S0 | Startup | STARTUP_PLAN_ONLY (TEXT_ONLY) | Show the complete two-layer plan only | Startup plan |
| B1 | Skill Builder | TEXT_ONLY | Define target figure class and generated skill goal | Figure-class brief |
| B2 | Skill Builder | TEXT_ONLY | Define corpus scope, venues, keywords, and lawful acquisition route | Corpus plan |
| B3 | Skill Builder | TEXT_ONLY | Acquire or organize open/user-authorized PDFs and manifests | Local corpus + retrieval manifest |
| B4 | Skill Builder | TEXT_ONLY | Extract paper cards, captions, figure inventory, labels, and visual observations | Evidence artifacts |
| B5 | Skill Builder | TEXT_ONLY | Build evidence-backed figure-class taxonomy | Taxonomy + lineage |
| B6 | Skill Builder | TEXT_ONLY | Convert taxonomy into specialized skill blueprint | Blueprint |
| B7 | Skill Builder | TEXT_ONLY | Generate specialized skill package | Skill folder/package |
| B8 | Skill Builder | TEXT_ONLY | Test and patch startup, state, candidate-board, rendering, and prompt behavior | Test report + patches |
| B9 | Skill Builder | TEXT_ONLY | Lock generated skill for reusable production | Locked skill with version/scope |

Required Generated Figure-Production Workflow

Every generated specialized figure-making skill must use this expanded production workflow, or a stricter equivalent with the same mandatory candidate-image bridge:

| Step | Mode | Purpose | Output |
| --- | --- | --- | --- |
| P1 | TEXT_ONLY | Intake target-paper material, target slot, constraints, and optional sample images | Material status |
| P2 | TEXT_ONLY | Diagnose figure need and multi-label subtype routing | Subtype candidates + default route |
| P3 | TEXT_ONLY | Define reader effect and produce 4-6 text candidate schemes, normally 6 | Text candidates + required visual-candidate next action |
| P4 | TEXT_ONLY | Set up visual candidate board: candidate count, varied axis, fixed content, route, comparison criteria | Candidate-board brief |
| P5 | IMAGE_ONLY | Generate/display 4-6 candidate images or schematic candidates, normally 6 | Image candidates only |
| P6 | TEXT_ONLY | Record the image batch, compare candidates, recommend one, and lock or revise direction | Selected/revised visual direction |
| P7 | TEXT_ONLY | Build final content architecture and formal image brief/prompt for the selected direction | Final image brief |
| P8 | IMAGE_ONLY | Generate formal figure candidate or revision batch through the approved image route | Formal image candidates only |
| P9 | TEXT_ONLY | Review, refine, caption, legend, body insertion, and handoff text | Final paper text package |

Rules for this workflow:

  • P3 must not make a text-only choice the primary route. Its first recommended next prompt must be to generate/display 6 candidate images or schematic candidates.
  • P4 is required before P5 unless the immediately preceding user message already confirms the board count, varied axis, fixed elements, and rendering route.
  • P5 is not a final figure stage. It is a visual selection stage.
  • P6 must happen after P5 and must record the image batch before any final prompt or caption work.
  • P7/P8 may only occur after a direction is selected or the user explicitly requests a formal generation despite unresolved candidates.
  • Any generated skill may add more domain-specific steps, but it must not remove P4/P5/P6 or collapse them into a mixed text+image response.
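
A skill-package test could verify that P4/P5/P6 survive intact, in order, with their mandated modes. The step-list shape is an assumption about how a generated workflow would be represented:

```python
REQUIRED_BRIDGE = [("P4", "TEXT_ONLY"), ("P5", "IMAGE_ONLY"), ("P6", "TEXT_ONLY")]

def workflow_ok(steps: list[tuple[str, str]]) -> bool:
    """True if P4/P5/P6 are all present, in order, with the mandated modes."""
    names = [s for s, _ in steps]
    try:
        positions = [names.index(s) for s, _ in REQUIRED_BRIDGE]
    except ValueError:
        return False                      # a bridge step was removed
    if positions != sorted(positions):
        return False                      # bridge steps out of order
    modes = dict(steps)
    return all(modes[s] == m for s, m in REQUIRED_BRIDGE)
```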

Generated Skill Package Requirements

Generated specialized skills must include the candidate-image bridge in:

  • SKILL.md
  • metadata.json
  • agents/openai.yaml
  • references/workflow-and-state-contract.md
  • references/visual-style-and-board-protocol.md
  • references/prompt-generation-policy.md
  • templates/state-footer-template.md
  • templates/figure-brief-template.md
  • templates/prompt-template.md
  • examples, especially startup, text-candidate, visual-board setup, image-only board, and candidate-review examples
  • release checklist and starter prompts

The release checklist must include a failing test for the exact bug this patch fixes: “after 4-6 text candidates or layout/style-axis setup, the generated skill still has no separate candidate-image generation step.”
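
The mandated failing test might be sketched like this, assuming the workflow is an ordered list of step names (the names themselves are illustrative):

```python
def has_candidate_image_step(workflow: list[str]) -> bool:
    """False reproduces the bug: text candidates flow straight into final work."""
    if "text_candidates" not in workflow:
        return True  # the rule only applies after a text-candidate step
    after = workflow[workflow.index("text_candidates") + 1:]
    final_steps = {"final_prompt", "final_image"}
    for step in after:
        if step == "candidate_image_board":
            return True   # a separate candidate-image step exists
        if step in final_steps:
            return False  # jumped directly to final work: the known bug
    return False
```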

Reference Loading Order

Load references as needed:

  1. references/master-workflow.md
  2. references/generated-specialized-skill-output-spec.md
  3. references/generated-skill-multi-candidate-policy.md
  4. references/visual-first-decision-board-protocol.md
  5. references/startup-plan-step-output-map.md
  6. references/planning-state-and-navigation-contract.md
  7. references/prompt-generation-and-rendering-policy.md
  8. references/strict-text-image-turn-separation-policy.md
  9. templates/specialized_skill_blueprint_template.md
  10. templates/state_footer_template.md

Version Note

Version 1.0.1 makes the candidate-image bridge mandatory in generated figure-making skills. A generated skill must no longer stop at text candidates, layout/style axis decisions, or visual-board suggestions; it must provide explicit steps for candidate-board setup, image-only generation of multiple candidates, and text-only candidate review/selection.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
