ToolUniverse Self-Evolution Orchestrator

Coordinates the full development lifecycle by dispatching to specialized devtu skills.

The Cycle

Discover → Create → Test → Fix → Optimize → Ship → Repeat

Each phase maps to a dedicated skill:

Phase	Skill	What it does
Discover	`devtu-auto-discover-apis`	Gap analysis, web search for APIs, batch discovery
Create	`devtu-create-tool`	Build tool class + JSON config + test examples
Test	(this skill)	Launch researcher persona agents to find issues
Fix	`devtu-fix-tool`	Diagnose failures, implement fixes, validate
Optimize	`devtu-optimize-skills`	Improve skill reports, evidence handling, UX
Optimize	`devtu-optimize-descriptions`	Improve tool JSON descriptions for clarity
Docs	`devtu-docs-quality`	Validate documentation accuracy
Ship	`devtu-github`	Branch, commit, push, create PR

Quick Start

Pick an entry point based on what's needed:

"Run a test round" → jump to Testing Phase
"Expand coverage" → invoke Skill(skill="devtu-auto-discover-apis")
"Create a new tool" → invoke Skill(skill="devtu-create-tool")
"Fix a broken tool" → invoke Skill(skill="devtu-fix-tool")
"Improve skills" → invoke Skill(skill="devtu-optimize-skills")
"Full cycle" → follow all phases below in order

Phase 1: Discovery (optional)

Invoke Skill(skill="devtu-auto-discover-apis") to:

Run gap analysis on current tool categories
Search for life science APIs in underrepresented domains
Score and prioritize APIs by coverage, reliability, documentation

Phase 2: Tool Creation (optional)

Invoke Skill(skill="devtu-create-tool") for each new API:

Create Python tool class implementing the API
Create JSON config with parameters, descriptions, test examples
Register in _lazy_registry_static.py and default_config.py
Validate: python -m tooluniverse.cli test <ToolName>

Phase 3: Testing Phase

This is the core testing loop, run directly by this skill.

Setup

Check for open PRs: gh pr list --state open
If unmerged PR → use that branch; if merged → new branch from origin/main
Rebase: git fetch origin && git rebase origin/main

Researcher Persona Agents

Launch 2 agents per round (A + B) using the Agent tool with these parameters:

Each agent gets:

Domain specialty (oncology, genomics, pharmacology, etc.)
Research question (specific biological question)
5-7 test scenarios exercising different tools
Instructions to report issues with severity (HIGH/MEDIUM/LOW)
Issue IDs: Feature-{round}{letter}-{num} (e.g., Feature-59A-001)

Agent prompt template — see references/persona-template.md

Verification (CRITICAL)

Before implementing ANY agent-reported issue, verify via CLI:

python3 -m tooluniverse.cli run <ToolName> '<json_args>'

50%+ of agent reports are false positives from MCP interface confusion. Only fix verified issues.

Fix Principles

Prevent, don't recover — fix root cause, not symptoms
Validate at input — reject bad params early with clear guidance
Distinguish "no data" from "bad query" — different messages for each
Fix the abstraction — don't add alias lists that grow forever

Anti-patterns: hint text instead of validation, parameter aliases instead of fixing naming, post-hoc probing instead of pre-validation.

Phase 4: Fix & Commit

Implement verified fixes (see references/bug-patterns.md for code-level patterns)
Run code-simplifier: Skill(skill="simplify") — always after writing or modifying code
Lint: ruff check src/tooluniverse/<file>.py
Verify syntax: python -c "from tooluniverse.<module> import <Class>"
Test: python -m tooluniverse.cli run <Tool> '<json>'
Pre-commit hook pattern: stage → commit (fails, reformats) → re-stage → commit
Push: git push origin <branch>

Also see Skill(skill="devtu-code-optimization") for reusable fix patterns and anti-patterns.

Phase 5: Optimize (optional)

After fixes are stable:

Skill(skill="devtu-optimize-descriptions") — improve tool descriptions
Skill(skill="devtu-optimize-skills") — improve research skill quality
Skill(skill="devtu-docs-quality") — validate docs accuracy

Phase 6: Ship

Invoke Skill(skill="devtu-github") or manually:

Rebase: git fetch origin && git stash && git rebase origin/main && git stash pop
git push --force-with-lease origin <branch>
Create or update PR: gh pr create / verify with gh pr view <N> --json mergeable
Verify "mergeable": "MERGEABLE" before reporting done

GitHub repo: mims-harvard/ToolUniverse — always verify with git remote -v before pushing.

Git Rules (CRITICAL)

NEVER push to main — all work on feature branches
NEVER have multiple open fix PRs — keep adding to current branch
Always rebase before push: git fetch origin && git rebase origin/main
Commit message format: no "BUG" terminology, use "Feature" or "Fix"
No AI attribution in commits

Common Issue Categories

Category	Signal
Silent parameter miss	Wrong-field check; param ignored
Always-fires conditional	`.get("field")` on wrong type
Silent normalization	Auto-transform not disclosed
Wrong notation/case	Gene fusions, Title Case names
Substring match	Short symbol returns multiple targets
try/except indent	Mismatched → SyntaxError

Full patterns → references/bug-patterns.md

devtu-self-evolve

Safety Notice

Copy this and send it to your AI assistant to learn