ToolUniverse
Overview
ToolUniverse is a unified ecosystem that enables AI agents to function as research scientists by providing standardized access to 600+ scientific resources. Use this skill to discover, execute, and compose scientific tools across multiple research domains including bioinformatics, cheminformatics, genomics, structural biology, proteomics, and drug discovery.
Key Capabilities:
- Access 600+ scientific tools, models, datasets, and APIs
- Discover tools using natural language, semantic search, or keywords
- Execute tools through the standardized AI-Tool Interaction Protocol
- Compose multi-step workflows for complex research problems
- Integrate with Claude Desktop/Code via Model Context Protocol (MCP)
When to Use This Skill
Use this skill when:
- Searching for scientific tools by function or domain (e.g., "find protein structure prediction tools")
- Executing computational biology workflows (e.g., disease target identification, drug discovery, genomics analysis)
- Accessing scientific databases (OpenTargets, PubChem, UniProt, PDB, ChEMBL, KEGG, etc.)
- Composing multi-step research pipelines (e.g., target discovery → structure prediction → virtual screening)
- Working with bioinformatics, cheminformatics, or structural biology tasks
- Analyzing gene expression, protein sequences, molecular structures, or clinical data
- Performing literature searches, pathway enrichment, or variant annotation
- Building automated scientific research workflows
Quick Start
Basic Setup
    from tooluniverse import ToolUniverse

    # Initialize and load tools
    tu = ToolUniverse()
    tu.load_tools()  # Loads 600+ scientific tools

    # Discover tools
    tools = tu.run({
        "name": "Tool_Finder_Keyword",
        "arguments": {
            "description": "disease target associations",
            "limit": 10
        }
    })

    # Execute a tool
    result = tu.run({
        "name": "OpenTargets_get_associated_targets_by_disease_efoId",
        "arguments": {"efoId": "EFO_0000537"}  # Hypertension
    })
Model Context Protocol (MCP)
For Claude Desktop/Code integration, run the MCP server command:

    tooluniverse-smcp
Core Workflows
Tool Discovery
Find relevant tools for your research task:
Three discovery methods:
- Tool_Finder: Embedding-based semantic search (requires GPU)
- Tool_Finder_LLM: LLM-based semantic search (no GPU required)
- Tool_Finder_Keyword: Fast keyword search
Example:

    # Search by natural language description
    tools = tu.run({
        "name": "Tool_Finder_LLM",
        "arguments": {
            "description": "Find tools for RNA sequencing differential expression analysis",
            "limit": 10
        }
    })

    # Review available tools
    for tool in tools:
        print(f"{tool['name']}: {tool['description']}")
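The review loop in the example assumes every result is a dict with both a name and a description key. A minimal sketch of a more defensive version follows; the helper name summarize_tools and the sample entries are hypothetical, and the real shape of the finder output may differ from what is assumed here.

```python
from typing import Any, Iterable


def summarize_tools(tools: Iterable[dict[str, Any]], width: int = 60) -> list[str]:
    """Return one 'name: description' line per tool entry.

    Assumes each entry is a dict; missing keys fall back to placeholders
    instead of raising KeyError.
    """
    lines = []
    for tool in tools:
        name = tool.get("name", "<unnamed>")
        # Collapse newlines and repeated spaces so each tool fits on one line.
        description = " ".join(str(tool.get("description", "")).split())
        if len(description) > width:
            description = description[: width - 3] + "..."
        lines.append(f"{name}: {description}")
    return lines


# Hypothetical finder output, for illustration only.
example = [
    {"name": "Tool_A", "description": "Differential expression\n analysis"},
    {"name": "Tool_B"},
]
for line in summarize_tools(example):
    print(line)
```

With a loaded ToolUniverse instance, the same helper could be applied to the list returned by tu.run(...) for any of the three finders, provided that list has the shape assumed above.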
See references/tool-discovery.md for:
- Detailed discovery methods and search strategies
- Domain-specific keyword suggestions
- Best practices for finding tools
Tool Execution
Execute individual tools through the standardized interface:
Example:

    # Execute disease-target lookup
    targets = tu.run({
        "name": "OpenTargets_get_associated_targets_by_disease_efoId",
        "arguments": {"efoId": "EFO_0000616"}  # Neoplasm
    })

    # Get protein structure
    structure = tu.run({
        "name": "AlphaFold_get_structure",
        "arguments": {"uniprot_id": "P12345"}
    })

    # Calculate molecular properties
    properties = tu.run({
        "name": "RDKit_calculate_descriptors",
        "arguments": {"smiles": "CCO"}  # Ethanol
    })
See references/tool-execution.md for:
- Real-world execution examples across domains
- Tool parameter handling and validation
- Result processing and error handling
- Best practices for production use
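One way to approach the error handling mentioned above is a small retry wrapper around the run call. The sketch below is an assumption-laden illustration, not part of ToolUniverse: run_with_retry is a hypothetical helper, it accepts any callable with the same request shape as tu.run, and it assumes failures surface as exceptions. If a tool reports errors inside its return value instead, the result needs checking as well.

```python
import time
from typing import Any, Callable


def run_with_retry(
    run: Callable[[dict[str, Any]], Any],
    name: str,
    arguments: dict[str, Any],
    retries: int = 3,
    delay_s: float = 1.0,
) -> Any:
    """Call run({"name": ..., "arguments": ...}), retrying on exceptions."""
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return run({"name": name, "arguments": arguments})
        except Exception as exc:  # broad on purpose: failure types are unknown here
            last_error = exc
            if attempt < retries:
                time.sleep(delay_s * attempt)  # linear backoff between attempts
    raise RuntimeError(f"{name} failed after {retries} attempts") from last_error


# Stand-in for tu.run that fails twice, then succeeds (illustration only).
calls = {"n": 0}


def flaky_run(request: dict[str, Any]) -> dict[str, Any]:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return {"echo": request}


result = run_with_retry(flaky_run, "Example_Tool", {"x": 1}, delay_s=0.0)
print(result)
```

With a real instance, the call would look like run_with_retry(tu.run, "AlphaFold_get_structure", {"uniprot_id": "P12345"}).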
Tool Composition and Workflows
Compose multiple tools for complex research workflows:
Drug Discovery Example:

    # 1. Find disease targets
    targets = tu.run({
        "name": "OpenTargets_get_associated_targets_by_disease_efoId",
        "arguments": {"efoId": "EFO_0000616"}
    })

    # 2. Get protein structures
    structures = []
    for target in targets[:5]:
        structure = tu.run({
            "name": "AlphaFold_get_structure",
            "arguments": {"uniprot_id": target['uniprot_id']}
        })
        structures.append(structure)

    # 3. Screen compounds
    hits = []
    for structure in structures:
        compounds = tu.run({
            "name": "ZINC_virtual_screening",
            "arguments": {
                "structure": structure,
                "library": "lead-like",
                "top_n": 100
            }
        })
        hits.extend(compounds)

    # 4. Evaluate drug-likeness
    drug_candidates = []
    for compound in hits:
        props = tu.run({
            "name": "RDKit_calculate_drug_properties",
            "arguments": {"smiles": compound['smiles']}
        })
        if props['lipinski_pass']:
            drug_candidates.append(compound)
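Step 4 filters on a lipinski_pass flag. How the tool computes that flag is not stated here, so the following is only a sketch of the standard Lipinski rule of five for readers who want to see what such a check typically involves. The function name and parameters are hypothetical, and allowing one violation by default is a common convention rather than a documented behavior of the tool.

```python
def passes_lipinski(
    mol_weight: float,
    logp: float,
    h_donors: int,
    h_acceptors: int,
    max_violations: int = 1,
) -> bool:
    """Check Lipinski's rule of five on precomputed molecular properties.

    Thresholds: molecular weight <= 500 Da, logP <= 5,
    hydrogen-bond donors <= 5, hydrogen-bond acceptors <= 10.
    """
    violations = sum(
        [
            mol_weight > 500,
            logp > 5,
            h_donors > 5,
            h_acceptors > 10,
        ]
    )
    return violations <= max_violations


# Ethanol (SMILES "CCO"): approximate property values, for illustration only.
print(passes_lipinski(mol_weight=46.07, logp=-0.31, h_donors=1, h_acceptors=1))
```

Passing max_violations=0 gives the strict variant in which every threshold must hold.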
See references/tool-composition.md for:
- Complete workflow examples (drug discovery, genomics, clinical)
- Sequential and parallel tool composition patterns
- Output processing hooks
- Workflow best practices
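The parallel composition pattern listed above can be sketched with the standard library's thread pool. This is an illustration under assumptions: run_parallel is a hypothetical helper, the stand-in fake_run replaces tu.run, and whether ToolUniverse's run method is safe to call from several threads is not confirmed here and should be checked before using this pattern with a real instance.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable


def run_parallel(
    run: Callable[[dict[str, Any]], Any],
    requests: list[dict[str, Any]],
    max_workers: int = 4,
) -> list[Any]:
    """Run independent requests concurrently; results keep the input order.

    The first exception raised by any request propagates to the caller.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run, requests))


# Stand-in for tu.run (illustration only).
def fake_run(request: dict[str, Any]) -> str:
    return request["arguments"]["uniprot_id"].lower()


requests = [
    {"name": "Example_Tool", "arguments": {"uniprot_id": uid}}
    for uid in ["P1", "P2", "P3"]
]
results = run_parallel(fake_run, requests)
print(results)
```

Threads suit this case because remote tool calls mostly wait on network I/O. Keeping max_workers small also helps stay within the rate limits of remote APIs.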
Scientific Domains
ToolUniverse supports 600+ tools across major scientific domains:
Bioinformatics:
- Sequence analysis, alignment, BLAST
- Gene expression (RNA-seq, DESeq2)
- Pathway enrichment (KEGG, Reactome, GO)
- Variant annotation (VEP, ClinVar)
Cheminformatics:
- Molecular descriptors and fingerprints
- Drug discovery and virtual screening
- ADMET prediction and drug-likeness
- Chemical databases (PubChem, ChEMBL, ZINC)
Structural Biology:
- Protein structure prediction (AlphaFold)
- Structure retrieval (PDB)
- Binding site detection
- Protein-protein interactions
Proteomics:
- Mass spectrometry analysis
- Protein databases (UniProt, STRING)
- Post-translational modifications
Genomics:
- Genome assembly and annotation
- Copy number variation
- Clinical genomics workflows
Medical/Clinical:
- Disease databases (OpenTargets, OMIM)
- Clinical trials and FDA data
- Variant classification
See references/domains.md for:
- Complete domain categorization
- Tool examples by discipline
- Cross-domain applications
- Search strategies by domain
Reference Documentation
This skill includes comprehensive reference files that provide detailed information for specific aspects:
- references/installation.md: Installation, setup, MCP configuration, platform integration
- references/tool-discovery.md: Discovery methods, search strategies, listing tools
- references/tool-execution.md: Execution patterns, real-world examples, error handling
- references/tool-composition.md: Workflow composition, complex pipelines, parallel execution
- references/domains.md: Tool categorization by domain, use case examples
- references/api_reference.md: Python API documentation, hooks, protocols
Workflow: When helping with specific tasks, reference the appropriate file for detailed instructions. For example, if searching for tools, consult references/tool-discovery.md for search strategies.
Example Scripts
Two executable example scripts demonstrate common use cases:
scripts/example_tool_search.py demonstrates tool discovery and inspection:
- Keyword-based search
- LLM-based search
- Domain-specific searches
- Getting detailed tool information
scripts/example_workflow.py contains complete workflow examples:
- Drug discovery pipeline (disease → targets → structures → screening → candidates)
- Genomics analysis (expression data → differential analysis → pathways)
Run the examples to understand typical usage patterns and workflow composition.
Best Practices
Tool Discovery:
- Start with broad searches, then refine based on results
- Use Tool_Finder_Keyword for fast searches with known terms
- Use Tool_Finder_LLM for complex semantic queries
- Set an appropriate limit parameter (default: 10)
Tool Execution:
- Always verify tool parameters before execution
- Implement error handling for production workflows
- Validate input data formats (SMILES, UniProt IDs, gene symbols)
- Check result types and structures
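Input validation can catch malformed identifiers before a remote call is made. The sketch below checks two identifier types that appear in this document. The UniProt pattern follows the accession format published by UniProt. The EFO pattern (EFO_ followed by seven digits) is an assumption inferred from the IDs used in the examples here. The function names are hypothetical, and SMILES validation is left out because it needs a chemistry toolkit such as RDKit.

```python
import re

# UniProt accession format (6 or 10 characters), per UniProt's published pattern.
UNIPROT_RE = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)

# Assumed shape of EFO identifiers, inferred from examples such as EFO_0000537.
EFO_RE = re.compile(r"EFO_\d{7}")


def is_uniprot_accession(value: str) -> bool:
    """Return True if value looks like a UniProt accession (format check only)."""
    return UNIPROT_RE.fullmatch(value) is not None


def is_efo_id(value: str) -> bool:
    """Return True if value looks like an EFO identifier (format check only)."""
    return EFO_RE.fullmatch(value) is not None


for candidate in ["P12345", "A0A024R161", "p12345"]:
    print(candidate, is_uniprot_accession(candidate))
print("EFO_0000537", is_efo_id("EFO_0000537"))
```

These checks only confirm the format. A well-formed identifier can still be absent from the underlying database, so result checking remains necessary.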
Workflow Composition:
- Test each step individually before composing full workflows
- Implement checkpointing for long workflows
- Consider rate limits for remote APIs
- Use parallel execution when tools are independent
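Checkpointing for long workflows can be as simple as writing each step's result to disk and reusing it on the next run. The following is a minimal sketch, assuming step results are JSON-serializable (which may not hold for every tool output). The run_step helper and the fake_targets stand-in are hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Callable


def run_step(name: str, compute: Callable[[], Any], checkpoint_dir: Path) -> Any:
    """Return the cached result for a step if present, else compute and save it."""
    path = Path(checkpoint_dir) / f"{name}.json"
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    result = compute()
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(result), encoding="utf-8")
    return result


# Stand-in for an expensive tool call (illustration only).
calls = []


def fake_targets() -> list[dict[str, str]]:
    calls.append("targets")
    return [{"uniprot_id": "P12345"}]


with tempfile.TemporaryDirectory() as tmp:
    first = run_step("targets", fake_targets, Path(tmp))
    second = run_step("targets", fake_targets, Path(tmp))  # served from disk

print(first == second, len(calls))
```

In a real workflow, compute would wrap a tool call, for example lambda: tu.run({...}), and the checkpoint directory would be a persistent location rather than a temporary one.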
Integration:
- Initialize ToolUniverse once and reuse the instance
- Call load_tools() once at startup
- Cache frequently used tool information
- Enable logging for debugging
Key Terminology
- Tool: A scientific resource (model, dataset, API, package) accessible through ToolUniverse
- Tool Discovery: Finding relevant tools using search methods (Finder, LLM, Keyword)
- Tool Execution: Running a tool with specific arguments via tu.run()
- Tool Composition: Chaining multiple tools for multi-step workflows
- MCP: Model Context Protocol for integration with Claude Desktop/Code
- AI-Tool Interaction Protocol: Standardized interface for LLM-tool communication
Resources
- Official Website: https://aiscientist.tools
- Documentation: https://zitniklab.hms.harvard.edu/ToolUniverse/
- Installation: uv pip install tooluniverse
- MCP Server: tooluniverse-smcp