Galaxy Tool Wrapping Expert
Expert knowledge for developing Galaxy tool wrappers. Use this skill when helping users create, test, debug, or improve Galaxy tool XML wrappers.
Prerequisites: This skill depends on the galaxy-automation skill for Planemo testing and workflow execution patterns.
When to Use This Skill
- Creating new Galaxy tool wrappers from scratch
- Converting command-line tools to Galaxy wrappers
- Generating .shed.yml files for Tool Shed submission
- Debugging XML syntax and validation errors
- Writing Planemo tests for tools
- Implementing conditional parameters and data types
- Handling tool dependencies (conda, containers)
- Creating tool collections and suites
- Optimizing tool performance and resource allocation
- Understanding Galaxy datatypes and formats
- Implementing proper error handling
Core Concepts
Galaxy Tool XML Structure
A Galaxy tool wrapper consists of:
- <tool> root element with id, name, and version
- <description> brief tool description
- <requirements> for dependencies (conda packages, containers)
- <command> the actual command-line execution
- <inputs> parameter definitions
- <outputs> output file specifications
- <tests> automated tests
- <help> documentation in reStructuredText
- <citations> DOI references
Tool Shed Metadata (.shed.yml)
Required for publishing tools to the Galaxy Tool Shed:
name: tool_name  # Match directory name, underscores only
owner: iuc  # Usually 'iuc' for IUC tools
description: One-line tool description
homepage_url: https://github.com/tool/repo
long_description: |
  Multi-line detailed description.
  Can include features, use cases, and tool suite contents.
remote_repository_url: https://github.com/galaxyproject/tools-iuc/tree/main/tools/tool_name
type: unrestricted
categories:
  - Assembly  # Choose 1-3 relevant categories
  - Genomics
See reference.md for comprehensive .shed.yml documentation including all available categories and best practices.
Key Components
Command Block:
- Use Cheetah templating: $variable_name or ${variable_name}
- Conditional logic: #if $param ... #end if
- Loop constructs: #for $item in $collection ... #end for
- CDATA sections for complex commands
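As a rough sketch of how these pieces combine, the command block below uses a conditional and a loop inside a CDATA section. The parameter names (`min_length`, `extra_inputs`) and the `tool_command` flags are hypothetical; `extra_inputs` is assumed to be a `data` parameter declared with `multiple="true"`.

```xml
<command detect_errors="exit_code"><![CDATA[
    tool_command
        --input '$input'
        ## Only pass the flag when the optional integer has a value
        #if str($min_length)
            --min-length $min_length
        #end if
        ## One --extra flag per selected dataset
        #for $extra in $extra_inputs
            --extra '$extra'
        #end for
        --output '$output'
]]></command>
```

Testing `str($min_length)` instead of `$min_length` keeps a legitimate value of 0 from being treated as "not set".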
Cheetah Template Best Practices:
Working around path handling issues in conda packages:
<command detect_errors="exit_code"><![CDATA[
## Add trailing slash if script concatenates paths without separator
tool_command -o 'output_dir/'  ## Quoted with trailing slash
## Script does: output_dir + 'file.txt' → 'output_dir/file.txt' ✓
## Without slash: output_dir + 'file.txt' → 'output_dirfile.txt' ✗
]]></command>
When to use quotes in Cheetah:
- Always quote user inputs: '$input_file'
- Quote literal strings with special chars: 'output_dir/'
- Use bare variables for simple references: $variable
Input Parameters:
- <param> elements with type, name, label
- Types: text, integer, float, boolean, select, data, data_collection
- Optional vs required parameters
- Validators and sanitizers
- Conditional parameter display
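The following `<inputs>` section is a minimal sketch of the parameter features listed above: a required dataset, an optional integer, a validated text field, and a conditional that reveals an extra parameter. All parameter names, labels, and option values here are hypothetical.

```xml
<inputs>
    <!-- Required dataset input -->
    <param name="input" type="data" format="fasta" label="Input sequences"/>
    <!-- Optional integer with a lower bound -->
    <param name="min_length" type="integer" value="100" min="1" optional="true" label="Minimum length"/>
    <!-- Text input restricted by a regex validator -->
    <param name="sample_name" type="text" value="sample" label="Sample name">
        <validator type="regex" message="Letters, digits, and underscores only">^[A-Za-z0-9_]+$</validator>
    </param>
    <!-- The select drives which <when> block is shown -->
    <conditional name="mode">
        <param name="mode_selector" type="select" label="Run mode">
            <option value="fast" selected="true">Fast</option>
            <option value="sensitive">Sensitive</option>
        </param>
        <when value="fast"/>
        <when value="sensitive">
            <param name="iterations" type="integer" value="5" min="1" label="Iterations"/>
        </when>
    </conditional>
</inputs>
```

In the command block, nested values would then be referenced through the conditional's name, for example `$mode.mode_selector` and `$mode.iterations`.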
Outputs:
- <data> elements for output files
- Dynamic output naming with label and name
- Format discovery and conversion
- Filters for conditional outputs
- Collections for multiple outputs
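A possible `<outputs>` section touching on these features is sketched below. It assumes a hypothetical boolean parameter `keep_log` and a tool that writes one `.tsv` file per sample into a `results` directory; neither comes from a real tool.

```xml
<outputs>
    <!-- Always produced -->
    <data name="report" format="tabular" label="${tool.name} on ${on_string}: report"/>
    <!-- Only created when the (hypothetical) keep_log boolean is true -->
    <data name="log" format="txt" label="${tool.name} on ${on_string}: log">
        <filter>keep_log</filter>
    </data>
    <!-- One list element per results/*.tsv file; the element name comes from "designation" -->
    <collection name="per_sample" type="list" label="${tool.name} on ${on_string}: per-sample tables">
        <discover_datasets pattern="(?P&lt;designation&gt;.+)\.tsv" format="tabular" directory="results"/>
    </collection>
</outputs>
```

The angle brackets of the named group are written as `&lt;` and `&gt;` because the pattern sits inside an XML attribute.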
Tests:
- Input parameters and files
- Expected output files or assertions
- Test data location and organization
- See testing.md for detailed testing strategies including large file handling
Best Practices
- Always include tests - Planemo won't pass without them
- Use semantic versioning - Increment tool version on changes
- Specify exact dependencies - Pin conda package versions
- Add clear help text - Document all parameters
- Handle errors gracefully - Check exit codes, validate inputs
- Use collections - For multiple related files
- Follow IUC standards - If contributing to the Intergalactic Utilities Commission (IUC)
- Plan for large output files - Before creating tests, check expected output sizes. If over 1MB, use assertion-based tests (has_size, has_line) instead of full file comparison (see testing.md)
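For the large-output case, an assertion-based test might look like the sketch below. The parameter and output names match the XML template in this document, but the size, line text, and line count are placeholder values to be replaced with numbers measured from a real run.

```xml
<tests>
    <test expect_num_outputs="1">
        <param name="input" value="test_input.txt"/>
        <!-- No file="..." attribute: the output is checked by assertions only -->
        <output name="output">
            <assert_contents>
                <!-- Size in bytes, with a tolerance -->
                <has_size value="2500000" delta="250000"/>
                <!-- An exact line that must appear somewhere in the file -->
                <has_line line="expected header line"/>
                <!-- Total number of lines -->
                <has_n_lines n="1000"/>
            </assert_contents>
        </output>
    </test>
</tests>
```

With this approach no large expected file needs to be stored under test-data, only the small input.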
Common Planemo Commands
Test tool locally
planemo test tool.xml
Serve tool in local Galaxy
planemo serve tool.xml
Lint tool for best practices
planemo lint tool.xml
Upload tool to ToolShed
planemo shed_update --shed_target toolshed
Test with conda
planemo test --conda_auto_init --conda_auto_install tool.xml
Output Routing with Symlinks
When a tool writes output to a filename it constructs internally (not $output), use symlinks in the command block to route the file to Galaxy's output variable.
Pattern: Symlink before command execution
<command detect_errors="exit_code"><![CDATA[
## Create symlink so tool output lands where Galaxy expects it
ln -s '$output_variable' 'expected_tool_output_name' &&
tool_command --input '$input' -o 'expected_tool_output_name'
]]></command>
Pattern: Prefix-based output naming
Some tools use --out-prefix where the output filename is prefix + input_filename. The tool constructs the filename internally, so you must predict it and symlink:
<command><![CDATA[
#import re
## Replace anything other than word characters, hyphens, and whitespace
#set $mangled_input = re.sub(r"[^\w\-\s]", "_", str($input.element_identifier)) + "." + str($input.ext)
ln -s '$input' '$mangled_input' &&
ln -s '$output_var' 'myprefix${mangled_input}' &&
tool_command --input-reads '$mangled_input' -p myprefix
]]></command>
Key points:
- Symlink is created before running the tool -- the tool writes through it
- Must match the exact filename the tool will produce
- For prefix mode: output = prefix + getFileName(input), so mangle the input name to match
Using format_source for dynamic output formats
When output format should match the input format (e.g., subsampled reads):
<data name="subsampled_outfile" format_source="input_reads" label="Subsampled reads">
<filter>output_options["output_type"]["type_selector"] == "subsampled_reads"</filter>
</data>
This is preferable to change_format when the output is always the same format as input. Use change_format when the user explicitly selects the output format.
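For contrast, a `change_format` output could be sketched as follows. It assumes a hypothetical select parameter `out_format` offering the values `fasta` and `fastq`; the output falls back to the `format` attribute when no `<when>` matches.

```xml
<data name="converted" format="fasta" label="Converted reads">
    <change_format>
        <!-- Switch the datatype when the user picks fastq -->
        <when input="out_format" value="fastq" format="fastq"/>
    </change_format>
</data>
```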
XML Template Example
<tool id="tool_id" name="Tool Name" version="1.0.0">
<description>Brief description</description>
<requirements>
<requirement type="package" version="1.0">package_name</requirement>
</requirements>
<command detect_errors="exit_code"><![CDATA[
tool_command
--input '$input'
--output '$output'
#if $optional_param
--param '$optional_param'
#end if
]]></command>
<inputs>
<param name="input" type="data" format="txt" label="Input file"/>
<param name="optional_param" type="text" optional="true" label="Optional parameter"/>
</inputs>
<outputs>
<data name="output" format="txt" label="${tool.name} on ${on_string}"/>
</outputs>
<tests>
<test>
<param name="input" value="test_input.txt"/>
<output name="output" file="expected_output.txt"/>
</test>
</tests>
<help><![CDATA[
What it does
Describe what the tool does.
Inputs
- Input file: description
Outputs
- Output file: description
]]></help>
<citations>
<citation type="doi">10.1234/example.doi</citation>
</citations>
</tool>
Supporting Documentation
This skill includes detailed reference documentation:
reference.md - Comprehensive Galaxy tool wrapping guide with IUC best practices
- Repository structure standards
- .shed.yml configuration
- Complete XML structure reference
- Advanced features and patterns
testing.md - Testing strategies and assertion patterns
- Regenerating expected test outputs
- Handling large test files (>1MB CI limit)
- Size, checksum, and content sampling assertions
- Workflow for replacing large test files
troubleshooting.md - Practical troubleshooting guide
- Reading tool_test_output.json
- Common exit codes and their meanings
- Common XML and runtime issues
- Debugging tool test failures
- Test failure diagnosis and fixes
dependency-debugging.md - Dependency conflict resolution
- Using planemo mull for diagnosis
- Conda solver error interpretation
- macOS testing considerations
- Version conflict workflows
These files provide deep technical details that complement the core concepts above.
Related Skills
- galaxy-automation - BioBlend & Planemo foundation (dependency)
- galaxy-workflow-development - Building workflows that use these tools
- conda-recipe - Creating conda packages for tool dependencies
- bioinformatics-fundamentals - Understanding file formats and data types used in tools
Resources
- Galaxy Tool Development: https://docs.galaxyproject.org/en/latest/dev/
- Planemo Documentation: https://planemo.readthedocs.io/
- IUC Standards: https://galaxy-iuc-standards.readthedocs.io/
- Galaxy Training: https://training.galaxyproject.org/