
# cjl-autoresearch-cc


Install: `npx skills add 0xcjl/cjl-autoresearch-cc`


## Overview

Improve skills, prompts, articles, workflows, and systems via iterative mutation-testing.

Core principle: One small verifiable change per round. Large rewrites are unverifiable and will be reverted.

Workflow: small edits → test → score → keep improvements, discard regressions.
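
The keep/revert loop above can be sketched in a few lines. This is a minimal illustration, not part of the skill itself: `mutate` and `score` are hypothetical callables standing in for a real single-edit mutator and a checklist-based scorer.

```python
# Minimal sketch of the keep/revert loop. `mutate` and `score` are
# hypothetical placeholders for a single-edit mutator and a
# checklist-based scorer.
def autoresearch(content, mutate, score, rounds=30):
    best, best_score = content, score(content)
    for _ in range(rounds):
        candidate = mutate(best)        # ONE small edit per round
        new_score = score(candidate)
        if new_score >= best_score:     # keep improvements (ties included)
            best, best_score = candidate, new_score
        # a lower score means the candidate is simply never adopted
    return best, best_score
```

Reverting is implicit: a regressing mutation is never adopted, so the best-known version is always what the next round mutates.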

Inspired by Karpathy/autoresearch and 0xcjl/openclaw-autoresearch-pro.

## Trigger Keywords

English: autoresearch

Chinese: 自动优化 (auto-optimize), 自动研究 (auto-research)

## Semantic Triggers (No Keywords Needed)

This skill activates when the user's intent matches, even without explicit keywords:

  • User wants to improve any skill, prompt, article, workflow, or system
  • User asks to polish, refine, enhance, or upgrade content
  • User wants iterative testing and improvement
  • User says '帮我改进一下这个prompt' ("help me improve this prompt"), 'optimize this'
  • User says '迭代优化' ("iterative optimization"), '循环改进' ("loop improvement"), '反复打磨' ("keep polishing")
  • User asks '能不能更好' ("can this be better"), '如何提升质量' ("how to raise the quality")
  • User uses "打磨" (polish), "精炼" (refine), "完善" (perfect), "升级" (upgrade) in the context of content improvement

## Supported Optimization Targets

| Mode | Input | Example |
| --- | --- | --- |
| Skill | Skill name or path | coding-standards, ~/.claude/skills/tdd-workflow |
| Plugin | Path to a plugin directory | ~/.claude/plugins/everything-claude-code |
| Prompt | A prompt text string | Inline or file path |
| Article | An article/document text | Inline or file path |
| Workflow | A process or workflow description | Inline or file path |
| System | A mechanism or system design | Inline or file path |

## Workflow

### Step 1 — Identify Mode and Target

Before proceeding, confirm with user:

"Optimize [target] in [mode] mode? (yes/no)"

If no, ask for clarification. If yes, proceed to Step 2.

Parse the user's request to determine mode. Check for:

Keyword triggers:

  • autoresearch [target] / 自动优化 [target] / 自动研究 [target]
  • optimize [target] / improve [target] / 优化 [target] / 改进 [target]
  • refine [target] / enhance [target] / 精炼 [target] / 增强 [target]

Semantic triggers (intent-based):

  • User wants to improve any skill, prompt, article, workflow, or system
  • User asks to polish, refine, enhance, or upgrade content
  • User describes wanting iterative testing and improvement

Mode detection from intent:

| User Intent | Mode |
| --- | --- |
| Optimize a skill/SKILL.md file | Skill |
| Optimize an agent configuration | Skill |
| Improve a custom command | Skill |
| Optimize a plugin | Plugin |
| Improve hooks configuration | Plugin |
| Improve a prompt text | Prompt |
| Polish an article/document | Article |
| Optimize a workflow/process | Workflow |
| Improve a system mechanism | System |

For Skill/Plugin mode, resolve the path:

  • Skill: ~/.claude/skills/<skill-name>/SKILL.md
  • Plugin: ~/.claude/plugins/<plugin-name>/

If the path doesn't exist, search in this order: ~/.claude/skills/ → current directory → ask the user.
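
That fallback order can be sketched as follows. The helper name and the exact candidate list are assumptions mirroring the paths above, not part of the skill:

```python
from pathlib import Path

# Sketch of the search order above: ~/.claude/skills/ first, then the
# current directory. Returning None signals "ask the user".
def resolve_skill_path(name):
    candidates = [
        Path.home() / ".claude" / "skills" / name / "SKILL.md",
        Path.cwd() / name / "SKILL.md",
    ]
    for path in candidates:
        if path.exists():
            return path
    return None
```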

Examples of semantic triggers (no keywords):

  • "帮我优化一下这个skill" ("help me optimize this skill") → Skill mode
  • "这个prompt不够好,帮我改进" ("this prompt isn't good enough, improve it") → Prompt mode
  • "我想让这篇文章更通顺" ("I want this article to read more smoothly") → Article mode
  • "优化一下部署流程" ("optimize the deployment workflow") → Workflow mode

### Step 2 — Generate Checklist (10 Questions)

Read the target content first. Then generate 10 diverse, specific yes/no checklist questions relevant to the content type:

For Skill/Plugin mode:

| # | Dimension | What to Check |
| --- | --- | --- |
| 1 | Description clarity | Is the description precise, actionable, and clear? Does it state what the skill does and when to use it? |
| 2 | Trigger coverage | Does it cover the main real-world use cases? |
| 3 | Workflow structure | Are steps clearly sequenced and unambiguous? |
| 4 | Error guidance | Does it handle error states and edge cases? |
| 5 | Tool usage accuracy | Are tool names and parameters correct for Claude Code? |
| 6 | Example quality | Do examples reflect real usage patterns? |
| 7 | Conciseness | Is content free of redundant repetition? |
| 8 | Freedom calibration | Is instruction specificity appropriate? |
| 9 | Reference quality | Are references and links accurate? |
| 10 | Completeness | Are all sections filled with real content? |

For Prompt mode:

| # | Dimension | What to Check |
| --- | --- | --- |
| 1 | Goal clarity | Does the prompt state a clear, specific goal? |
| 2 | Role/tone | Is the desired role or tone specified? |
| 3 | Input format | Is the input format clearly described? |
| 4 | Output format | Is the expected output format specified? |
| 5 | Constraints | Are key constraints and boundaries stated? |
| 6 | Context sufficiency | Is enough context provided to avoid hallucination? |
| 7 | Edge cases | Does it handle ambiguous or edge-case inputs? |
| 8 | Conciseness | Is it free of redundant or contradictory instructions? |
| 9 | Actionability | Are instructions concrete and actionable rather than vague? |
| 10 | Completeness | Are all necessary elements for the task present? |

For Article/Documentation mode:

| # | Dimension | What to Check |
| --- | --- | --- |
| 1 | Title quality | Does the title clearly convey the main value? Is it specific enough? |
| 2 | Opening hook | Does the opening grab attention? Does it set clear expectations? |
| 3 | Logical structure | Are ideas logically organized (not random)? |
| 4 | Argument clarity | Are claims supported with evidence or reasoning? |
| 5 | Conciseness | Is unnecessary padding or repetition removed? |
| 6 | Transition flow | Do paragraphs/sections flow smoothly? |
| 7 | Closing strength | Does the conclusion summarize and inspire action? |
| 8 | Tone consistency | Is the tone consistent throughout? |
| 9 | Readability | Is sentence/paragraph length varied appropriately? |
| 10 | Audience match | Does language match the target audience level? |

For Workflow/System mode:

| # | Dimension | What to Check |
| --- | --- | --- |
| 1 | Goal clarity | Is the objective clearly stated? |
| 2 | Step sequencing | Are steps in logical, efficient order? |
| 3 | Completeness | Are all necessary steps present? |
| 4 | Error handling | Are failure modes addressed (timeout, auth, network, resource exhaustion)? |
| 5 | Edge cases | Are corner cases considered (empty input, large files)? |
| 6 | Simplicity | Is the workflow/system as simple as possible? Can steps be combined or eliminated? |
| 7 | Observability | Can progress/status be tracked? |
| 8 | Reversibility | Can steps be undone if errors occur? |
| 9 | Automation potential | Which steps could be automated? |
| 10 | Maintainability | Is it easy to modify and extend? |

Present the 10 questions, numbered 1-10. Ask the user to select which ones to activate.

Rule: Must use at least 5 questions. Using fewer makes scoring unreliable.
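The minimum-selection rule above could be enforced with a small check. The function name is illustrative, not part of the skill:

```python
# Enforce the "at least 5 questions" rule; valid numbers are 1-10.
MIN_QUESTIONS = 5

def validate_selection(selected):
    chosen = set(selected)
    if not chosen.issubset(range(1, 11)):
        raise ValueError("checklist questions are numbered 1-10")
    if len(chosen) < MIN_QUESTIONS:
        raise ValueError(f"select at least {MIN_QUESTIONS} questions")
    return sorted(chosen)
```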

After presenting, ask: "Ready to start the optimization loop? (yes/start)"

### Step 3 — Prepare Test Cases

Test cases validate that mutations improve, not harm, the content. Generate realistic user scenarios.

  • Skill/Plugin mode: Generate 3-5 realistic prompts a user would send when using the skill/plugin
  • Prompt mode: Generate 3-5 test inputs that the prompt would process
  • Article mode: Generate 3-5 ways the article might be read or consumed
  • Workflow mode: Generate 3-5 scenarios the workflow would handle
  • System mode: Generate 3-5 conditions the system would encounter

Store test cases in context — do not write to disk unless needed.

### Step 4 — Run Autoresearch Loop

Tip: For mutation strategies, see Mutation Strategy Reference below.

Loop configuration:

  • Rounds per batch: 30
  • Max total rounds: 100
  • Pause: After every 30 rounds, show summary and ask user to continue or stop
  • Stop conditions:
    • User says stop
    • 100 rounds completed
    • Score reaches 100%
    • No improvement for 10 consecutive rounds
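
The stop conditions above fold into a single predicate; the argument names here are assumptions, not part of the skill:

```python
# True when any stop condition from the loop configuration holds.
def should_stop(round_num, best_score, stale_rounds,
                user_said_stop=False, max_rounds=100, patience=10):
    return (
        user_said_stop
        or round_num >= max_rounds    # 100 rounds completed
        or best_score >= 100          # score reached 100%
        or stale_rounds >= patience   # no improvement for 10 rounds
    )
```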

Per-round procedure:

Track progress: Round N/100 | Best: XX% | Last: +/-YY

Constraint: ONE mutation per round. Multiple changes = unverifiable = will be reverted.

  1. Mutate: Make ONE small edit (see Mutation types)

  2. Test: For each test case, simulate what output the content would produce

    Constraint: Be honest. If the output would disappoint a user, the mutation failed.

  3. Score: Apply each active checklist question (0 or 1 per question). Score = (passed / total_questions) × 100

    Scoring scale:

    • 10/10 = 100% (perfect)
    • 7/10 = 70% (good)
    • 5/10 = 50% (minimum viable)
  4. Decide: If new score ≥ best score → keep the mutation. If lower → revert

    Example: Best=85%, New=87% → Keep. Best=85%, New=83% → Revert.

    Trust the score. Don't rationalize a bad mutation.

  5. Log: Round number, mutation type, score, keep/revert decision

Mutation types (pick ONE per round):

| Type | Name | When to Use |
| --- | --- | --- |
| A | Add constraint | When content is too vague |
| B | Strengthen coverage | When trigger cases are missing |
| C | Add example | When steps are too abstract |
| D | Tighten language | When wording is soft ("try to") |
| E | Error handling | When failure modes are missing |
| F | Remove redundancy | When content is verbose |
| G | Improve transitions | When flow is choppy |
| H | Expand thin section | When content is sparse |
| I | Add cross-ref | When sections are isolated |
| J | Adjust freedom | When instruction specificity is off balance |

### Step 5 — Report Results (after each batch)

See Quick Reference below for output format examples.

After each batch (30 rounds):

Example:

Batch 1 (rounds 1-30):
  Best score: 85%
  Mutations kept: 23  |  Reverted: 7
  Most effective types: A, C, D

After full completion:

Optimized: [filename/path]
Score: XX% → YY% (+ZZ%)
Rounds: N (kept: K, reverted: R)
Top mutations: [type, type, type]
---
Final content:
[diff or inline]

## Mutation Strategy Reference

High-impact, low-risk changes:

  • Adding explicit constraints where the content is vague
  • Expanding trigger coverage to include edge cases
  • Adding concrete examples to abstract instructions
  • Tightening soft language ("try to" → "must")

Avoid in one round:

  • Large rewrites of entire sections
  • Multiple unrelated changes at once
  • Changing fundamental scope or purpose
  • Formatting-only changes (no testable value)
  • Adding content the user didn't request
  • Removing more than 10% of content

## Quick Reference

### Keywords Reference

  • Auto-detect: autoresearch, 自动优化, 自动研究
  • Skill: autoresearch ~/.claude/skills/tdd
  • Prompt: optimize this prompt: [text]
  • Workflow: optimize the deployment workflow
  • System: improve the error handling system

### Semantic Triggers (No Keywords)

  • "帮我优化一下这个skill" ("optimize this skill for me") → Skill mode
  • "这个prompt不太行" ("this prompt isn't great") → Prompt mode
  • "我想让文章更通顺" ("I want the article to read more smoothly") → Article mode
  • "优化一下部署流程" ("optimize the deployment workflow") → Workflow mode
  • "改进一下这个系统" ("improve this system") → System mode
  • "improve this code review" → Prompt/Skill mode
  • "polish this documentation" → Article mode

### Mode Detection

| Situation | Action |
| --- | --- |
| Path detected | Skill/Plugin mode |
| Keyword present | Keyword-specified mode |
| Short text | Prompt mode |
| Long document | Article mode |
| Uncertain | Prompt mode (default) |

Edge cases: empty input → ask. Invalid path → fall back to ~/.claude/skills/. Ambiguous → ask.
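
The fallback table can be sketched as a rough heuristic. The word-count threshold and the prefix test are assumptions; real keyword detection would take precedence:

```python
# Heuristic mode detection mirroring the table above. Keyword checks
# (e.g. "autoresearch <target>") would run before this in practice.
def detect_mode(target):
    text = target.strip()
    if not text:
        return "ask"                    # empty input: ask the user
    if text.startswith(("~/", "/", "./")):
        return "Skill/Plugin"           # path detected
    if len(text.split()) > 300:         # long document
        return "Article"
    return "Prompt"                     # short text, or uncertain default
```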
