devtu-self-evolve

Orchestrate the full ToolUniverse self-improvement cycle: discover APIs, create tools, test with researcher personas, fix issues, optimize skills, and push via git. References and dispatches to all other devtu skills. Use when asked to: run the self-improvement loop, do a debug/test round, expand tool coverage, improve tool quality, or evolve ToolUniverse.

Install skill "devtu-self-evolve" with this command: npx skills add mims-harvard/tooluniverse/mims-harvard-tooluniverse-devtu-self-evolve

ToolUniverse Self-Evolution Orchestrator

Coordinates the full development lifecycle by dispatching to specialized devtu skills.

The Cycle

Discover → Create → Test → Fix → Optimize → Ship → Repeat

Each phase maps to a dedicated skill:

| Phase | Skill | What it does |
|---|---|---|
| Discover | devtu-auto-discover-apis | Gap analysis, web search for APIs, batch discovery |
| Create | devtu-create-tool | Build tool class + JSON config + test examples |
| Test | (this skill) | Launch researcher persona agents to find issues |
| Fix | devtu-fix-tool | Diagnose failures, implement fixes, validate |
| Optimize | devtu-optimize-skills | Improve skill reports, evidence handling, UX |
| Optimize | devtu-optimize-descriptions | Improve tool JSON descriptions for clarity |
| Docs | devtu-docs-quality | Validate documentation accuracy |
| Ship | devtu-github | Branch, commit, push, create PR |

Quick Start

Pick an entry point based on what's needed:

  • "Run a test round" → jump to Testing Phase
  • "Expand coverage" → invoke Skill(skill="devtu-auto-discover-apis")
  • "Create a new tool" → invoke Skill(skill="devtu-create-tool")
  • "Fix a broken tool" → invoke Skill(skill="devtu-fix-tool")
  • "Improve skills" → invoke Skill(skill="devtu-optimize-skills")
  • "Full cycle" → follow all phases below in order

Phase 1: Discovery (optional)

Invoke Skill(skill="devtu-auto-discover-apis") to:

  1. Run gap analysis on current tool categories
  2. Search for life science APIs in underrepresented domains
  3. Score and prioritize APIs by coverage, reliability, documentation

Phase 2: Tool Creation (optional)

Invoke Skill(skill="devtu-create-tool") for each new API:

  1. Create Python tool class implementing the API
  2. Create JSON config with parameters, descriptions, test examples
  3. Register in _lazy_registry_static.py and default_config.py
  4. Validate: python -m tooluniverse.cli test <ToolName>

Phase 3: Testing Phase

This is the core testing loop, run directly by this skill.

Setup

  1. Check for open PRs: gh pr list --state open
  2. If an unmerged PR is still open → keep working on its branch; if the previous PR was merged → create a new branch from origin/main
  3. Rebase: git fetch origin && git rebase origin/main
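
The branch decision in step 2 can be sketched in Python. This is a minimal illustration, assuming the open-PR list was fetched as JSON with gh pr list --state open --json headRefName. The function name choose_branch and the branch names in the usage lines are hypothetical and not part of ToolUniverse.

```python
import json


def choose_branch(open_prs_json: str, new_branch: str) -> tuple[str, bool]:
    """Return (branch, is_new) following the Setup rule.

    open_prs_json is assumed to be the output of
    `gh pr list --state open --json headRefName`.
    """
    open_prs = json.loads(open_prs_json)
    if open_prs:
        # An unmerged PR exists: keep working on its branch.
        return open_prs[0]["headRefName"], False
    # No open PR: use a new branch (to be created from origin/main).
    return new_branch, True


# Hypothetical branch names, for illustration only.
print(choose_branch('[{"headRefName": "fix/round-59"}]', "fix/round-60"))
print(choose_branch("[]", "fix/round-60"))
```

Taking the first open PR matches the "never have multiple open fix PRs" rule, under which at most one open PR should exist.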

Researcher Persona Agents

Launch 2 agents per round (A and B) using the Agent tool. Each agent gets:

  • Domain specialty (oncology, genomics, pharmacology, etc.)
  • Research question (specific biological question)
  • 5-7 test scenarios exercising different tools
  • Instructions to report issues with severity (HIGH/MEDIUM/LOW)
  • Issue IDs: Feature-{round}{letter}-{num} (e.g., Feature-59A-001)

Agent prompt template — see references/persona-template.md
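
The issue ID scheme above can be sketched as a pair of helpers. This is an illustration only: the helper names are hypothetical, and the three-digit zero padding is inferred from the single example Feature-59A-001 rather than stated by the source.

```python
import re

# Feature-{round}{letter}-{num}, e.g. Feature-59A-001
ISSUE_ID = re.compile(r"^Feature-(\d+)([A-Z])-(\d+)$")


def format_issue_id(round_no: int, agent: str, num: int) -> str:
    """Build an ID such as Feature-59A-001 (padding width is an assumption)."""
    return f"Feature-{round_no}{agent}-{num:03d}"


def parse_issue_id(issue_id: str) -> tuple[int, str, int]:
    """Split an ID into (round, agent letter, issue number)."""
    match = ISSUE_ID.match(issue_id)
    if match is None:
        raise ValueError(f"not a valid issue ID: {issue_id!r}")
    round_no, agent, num = match.groups()
    return int(round_no), agent, int(num)


print(format_issue_id(59, "A", 1))
print(parse_issue_id("Feature-59B-012"))
```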

Verification (CRITICAL)

Before implementing ANY agent-reported issue, verify via CLI:

python3 -m tooluniverse.cli run <ToolName> '<json_args>'

More than half of agent reports are false positives caused by MCP interface confusion, so fix only issues that reproduce via the CLI.
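
One way to build the verification command without hand-quoting the JSON is sketched below. The helper name build_verify_command, the tool name ExampleTool, and its query argument are placeholders assumed for illustration; only the python3 -m tooluniverse.cli run invocation comes from the text above.

```python
import json
import shlex


def build_verify_command(tool_name: str, arguments: dict) -> list[str]:
    """Return the argv list for `python3 -m tooluniverse.cli run <ToolName> '<json_args>'`."""
    return ["python3", "-m", "tooluniverse.cli", "run", tool_name, json.dumps(arguments)]


# "ExampleTool" and its arguments are placeholders, not a real tool.
cmd = build_verify_command("ExampleTool", {"query": "TP53"})
print(shlex.join(cmd))  # shell-quoted form, suitable for pasting into a terminal
```

Passing the list directly to subprocess.run(cmd) avoids shell quoting problems with nested JSON.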

Fix Principles

  1. Prevent, don't recover — fix root cause, not symptoms
  2. Validate at input — reject bad params early with clear guidance
  3. Distinguish "no data" from "bad query" — different messages for each
  4. Fix the abstraction — don't add alias lists that grow forever

Anti-patterns: hint text instead of validation, parameter aliases instead of fixing naming, post-hoc probing instead of pre-validation.
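
Principles 2 and 3 can be illustrated with a small sketch. Everything here is hypothetical: the function lookup_gene, the allowed species set, the response shape, and the in-memory database stand in for a real tool and are not ToolUniverse code.

```python
VALID_SPECIES = {"human", "mouse"}  # hypothetical allowed values


def lookup_gene(symbol: str, species: str, database: dict) -> dict:
    """Validate parameters first, then separate 'bad query' from 'no data'."""
    # Validate at input: reject bad params early with clear guidance.
    if not symbol:
        return {"status": "error", "error": "symbol is required, e.g. 'TP53'"}
    if species not in VALID_SPECIES:
        expected = ", ".join(sorted(VALID_SPECIES))
        return {"status": "error", "error": f"unknown species {species!r}; expected one of: {expected}"}

    records = database.get((symbol, species), [])
    if not records:
        # The query was well-formed; there is simply nothing to return.
        return {"status": "success", "data": [], "message": "valid query, no records found"}
    return {"status": "success", "data": records}


db = {("TP53", "human"): [{"id": 1}]}  # toy data for illustration
print(lookup_gene("TP53", "human", db))
print(lookup_gene("TP53", "zebra", db))
print(lookup_gene("BRCA9", "human", db))
```

A caller can then tell a malformed request (status "error") from an empty but valid result (status "success" with empty data), instead of receiving the same message for both.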

Phase 4: Fix & Commit

  1. Implement verified fixes (see references/bug-patterns.md for code-level patterns)
  2. Run code-simplifier: Skill(skill="simplify") — always after writing or modifying code
  3. Lint: ruff check src/tooluniverse/<file>.py
  4. Verify the module imports cleanly (catches syntax errors): python -c "from tooluniverse.<module> import <Class>"
  5. Test: python -m tooluniverse.cli run <Tool> '<json>'
  6. Pre-commit hook pattern: stage → commit (the hook reformats files and the commit fails) → re-stage → commit again
  7. Push: git push origin <branch>

Also see Skill(skill="devtu-code-optimization") for reusable fix patterns and anti-patterns.
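
The pre-commit pattern in step 6 can be sketched as a single retry. This is an assumption-laden illustration: commit_with_restage is a hypothetical helper, and the command runner is injected so the logic can be exercised without a git repository.

```python
from typing import Callable, Sequence

# A runner executes a command and returns its exit code.
Runner = Callable[[Sequence[str]], int]


def commit_with_restage(run: Runner, paths: Sequence[str], message: str) -> bool:
    """Stage and commit; if the hook rewrites files and the commit fails, re-stage and retry once."""
    run(["git", "add", *paths])
    if run(["git", "commit", "-m", message]) == 0:
        return True
    # The pre-commit hook reformatted files: stage the reformatted versions and retry.
    run(["git", "add", *paths])
    return run(["git", "commit", "-m", message]) == 0
```

With real git, the runner could be lambda cmd: subprocess.run(cmd).returncode. A second failure returns False, which usually signals a genuine lint error rather than reformatting.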

Phase 5: Optimize (optional)

After fixes are stable:

  • Skill(skill="devtu-optimize-descriptions") — improve tool descriptions
  • Skill(skill="devtu-optimize-skills") — improve research skill quality
  • Skill(skill="devtu-docs-quality") — validate docs accuracy

Phase 6: Ship

Invoke Skill(skill="devtu-github") or manually:

  1. Rebase: git fetch origin && git stash && git rebase origin/main && git stash pop
  2. git push --force-with-lease origin <branch>
  3. Create or update PR: gh pr create / verify with gh pr view <N> --json mergeable
  4. Verify "mergeable": "MERGEABLE" before reporting done
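
The mergeability check in step 4 can be sketched as a small parser over the JSON printed by gh pr view <N> --json mergeable. The helper name is_mergeable is hypothetical. The sketch treats any value other than "MERGEABLE" (for example "CONFLICTING" or "UNKNOWN") as not ready.

```python
import json


def is_mergeable(pr_view_json: str) -> bool:
    """Return True only when the PR reports "mergeable": "MERGEABLE"."""
    return json.loads(pr_view_json).get("mergeable") == "MERGEABLE"


print(is_mergeable('{"mergeable": "MERGEABLE"}'))
print(is_mergeable('{"mergeable": "CONFLICTING"}'))
```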

GitHub repo: mims-harvard/ToolUniverse — always verify with git remote -v before pushing.


Git Rules (CRITICAL)

  • NEVER push to main — all work on feature branches
  • NEVER have multiple open fix PRs — keep adding to current branch
  • Always rebase before push: git fetch origin && git rebase origin/main
  • Commit message format: no "BUG" terminology, use "Feature" or "Fix"
  • No AI attribution in commits
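
The two commit message rules could be checked before committing, as in the sketch below. The function check_commit_message is hypothetical, and the default attribution marker is an assumed example, since the source does not list specific strings.

```python
import re


def check_commit_message(message: str, ai_markers: tuple[str, ...] = ("generated with",)) -> list[str]:
    """Return a list of rule violations; an empty list means the message passes."""
    problems = []
    # Rule: no "BUG" terminology (whole word, any case).
    if re.search(r"\bbugs?\b", message, flags=re.IGNORECASE):
        problems.append('uses "BUG" terminology; use "Feature" or "Fix"')
    # Rule: no AI attribution. The marker list is an assumption, adjust as needed.
    lowered = message.lower()
    for marker in ai_markers:
        if marker in lowered:
            problems.append(f"possible AI attribution: {marker!r}")
    return problems


print(check_commit_message("Fix: reject empty gene symbol early"))
print(check_commit_message("BUG: crash on empty symbol"))
```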

Common Issue Categories

| Category | Signal |
|---|---|
| Silent parameter miss | Wrong-field check; param ignored |
| Always-fires conditional | .get("field") on wrong type |
| Silent normalization | Auto-transform not disclosed |
| Wrong notation/case | Gene fusions, Title Case names |
| Substring match | Short symbol returns multiple targets |
| try/except indent | Mismatched → SyntaxError |

Full patterns → references/bug-patterns.md
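
As an illustration of the "Substring match" category, the sketch below contrasts substring and exact matching on a toy symbol list. The list and function names are made up for this example and do not come from references/bug-patterns.md.

```python
TARGETS = ["AR", "ARID1A", "PARP1", "TP53"]  # illustrative gene symbols


def substring_match(query: str, targets: list[str]) -> list[str]:
    """Problematic: a short symbol matches every target that contains it."""
    return [t for t in targets if query in t]


def exact_match(query: str, targets: list[str]) -> list[str]:
    """Safer: only the target equal to the query is returned."""
    return [t for t in targets if t == query]


print(substring_match("AR", TARGETS))
print(exact_match("AR", TARGETS))
```

Here the short symbol AR also matches ARID1A and PARP1 under substring matching, which is the signal described in the table.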
