galaxy-tool-wrapping

Galaxy Tool Wrapping Expert

Install skill "galaxy-tool-wrapping" with this command: npx skills add delphine-l/claude_global/delphine-l-claude-global-galaxy-tool-wrapping

Expert knowledge for developing Galaxy tool wrappers. Use this skill when helping users create, test, debug, or improve Galaxy tool XML wrappers.

Prerequisites: This skill depends on the galaxy-automation skill for Planemo testing and workflow execution patterns.

When to Use This Skill

  • Creating new Galaxy tool wrappers from scratch

  • Converting command-line tools to Galaxy wrappers

  • Generating .shed.yml files for Tool Shed submission

  • Debugging XML syntax and validation errors

  • Writing Planemo tests for tools

  • Implementing conditional parameters and data types

  • Handling tool dependencies (conda, containers)

  • Creating tool collections and suites

  • Optimizing tool performance and resource allocation

  • Understanding Galaxy datatypes and formats

  • Implementing proper error handling

Core Concepts

Galaxy Tool XML Structure

A Galaxy tool wrapper consists of:

  • <tool> root element with id, name, and version

  • <description> brief tool description

  • <requirements> for dependencies (conda packages, containers)

  • <command> the actual command-line execution

  • <inputs> parameter definitions

  • <outputs> output file specifications

  • <tests> automated tests

  • <help> documentation in reStructuredText

  • <citations> DOI references
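As a rough illustration of the structure listed above, the following Python sketch checks a wrapper for the listed root attributes and child elements. The function name `missing_parts` and the example wrapper are hypothetical, and this is not a substitute for `planemo lint`, which applies Galaxy's real schema.

```python
import xml.etree.ElementTree as ET
from typing import List

# Child elements from the list above; <tool> itself is the root.
EXPECTED_CHILDREN = [
    "description", "requirements", "command", "inputs",
    "outputs", "tests", "help", "citations",
]
REQUIRED_ROOT_ATTRS = ["id", "name", "version"]


def missing_parts(tool_xml: str) -> List[str]:
    """Return the expected attributes (as @name) and elements that are absent."""
    root = ET.fromstring(tool_xml)
    if root.tag != "tool":
        return ["tool"]
    missing = [f"@{attr}" for attr in REQUIRED_ROOT_ATTRS if attr not in root.attrib]
    missing += [tag for tag in EXPECTED_CHILDREN if root.find(tag) is None]
    return missing


# Hypothetical, deliberately incomplete wrapper used only for this demo.
example = """<tool id="demo" name="Demo" version="1.0.0">
  <description>demo</description>
  <command><![CDATA[echo hi > '$output']]></command>
  <inputs/>
  <outputs><data name="output" format="txt"/></outputs>
</tool>"""

print(missing_parts(example))  # ['requirements', 'tests', 'help', 'citations']
```

A check like this is only a quick local sanity test; not every element is mandatory for every tool.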

Tool Shed Metadata (.shed.yml)

Required for publishing tools to the Galaxy Tool Shed:

name: tool_name  # Match directory name, underscores only
owner: iuc  # Usually 'iuc' for IUC tools
description: One-line tool description
homepage_url: https://github.com/tool/repo
long_description: |
  Multi-line detailed description.
  Can include features, use cases, and tool suite contents.
remote_repository_url: https://github.com/galaxyproject/tools-iuc/tree/main/tools/tool_name
type: unrestricted
categories:
  - Assembly  # Choose 1-3 relevant categories
  - Genomics

See reference.md for comprehensive .shed.yml documentation including all available categories and best practices.
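The inline comments in the metadata above (underscores only in the name, 1-3 categories) can be turned into a small pre-submission check. The sketch below is illustrative only: it works on an already-parsed dictionary, and the `shed_problems` function, the choice of keys treated as required, and the lowercase-and-digits naming rule are assumptions rather than the Tool Shed's own validation rules.

```python
import re
from typing import List

# Assumption: these keys are treated as mandatory for the purpose of this sketch.
REQUIRED_KEYS = ["name", "owner", "description", "categories"]


def shed_problems(meta: dict) -> List[str]:
    """Return human-readable problems found in parsed .shed.yml metadata."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in meta]
    name = meta.get("name", "")
    # Assumption: "underscores only" means no hyphens or other punctuation.
    if name and not re.fullmatch(r"[a-z0-9_]+", name):
        problems.append("name should use lowercase letters, digits, and underscores only")
    if "categories" in meta and not 1 <= len(meta["categories"]) <= 3:
        problems.append("choose 1-3 categories")
    return problems


good = {"name": "tool_name", "owner": "iuc",
        "description": "One-line tool description",
        "categories": ["Assembly", "Genomics"]}
bad = {"name": "tool-name", "owner": "iuc",
       "description": "One-line tool description", "categories": []}

print(shed_problems(good))  # []
print(shed_problems(bad))
```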

Key Components

Command Block:

  • Use Cheetah templating: $variable_name or ${variable_name}

  • Conditional logic: #if $param ... #end if

  • Loop constructs: #for $item in $collection... #end for

  • CDATA sections for complex commands

Cheetah Template Best Practices:

Working around path handling issues in conda packages:

<command detect_errors="exit_code"><![CDATA[
    ## Add trailing slash if script concatenates paths without separator
    tool_command -o 'output_dir/'  ## Quoted with trailing slash

    ## Script does: output_dir + 'file.txt' → 'output_dir/file.txt' ✓
    ## Without slash: output_dir + 'file.txt' → 'output_dirfile.txt' ✗
]]></command>

When to use quotes in Cheetah:

  • Always quote user inputs: '$input_file'

  • Quote literal strings with special chars: 'output_dir/'

  • Use bare variables for simple references: $variable
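To see why quoting substituted values matters, the sketch below mimics the substitution step with a plain string replace (a stand-in for Cheetah, not how Galaxy renders templates) and then splits the result the way a POSIX shell would. The `render` helper and the file name are hypothetical.

```python
import shlex


def render(template: str, value: str) -> str:
    """Stand-in for template substitution: replace $input_file with a value."""
    return template.replace("$input_file", value)


path = "my reads.fastq"  # hypothetical value containing a space

unquoted = shlex.split(render("tool_command --input $input_file", path))
quoted = shlex.split(render("tool_command --input '$input_file'", path))

print(unquoted)  # ['tool_command', '--input', 'my', 'reads.fastq']
print(quoted)    # ['tool_command', '--input', 'my reads.fastq']
```

Without quotes the value is split into two arguments; with single quotes it reaches the tool as one.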

Input Parameters:

  • <param> elements with type, name, label

  • Types: text, integer, float, boolean, select, data, data_collection

  • Optional vs required parameters

  • Validators and sanitizers

  • Conditional parameter display

Outputs:

  • <data> elements for output files

  • Dynamic output naming with label and name

  • Format discovery and conversion

  • Filters for conditional outputs

  • Collections for multiple outputs

Tests:

  • Input parameters and files

  • Expected output files or assertions

  • Test data location and organization

  • See testing.md for detailed testing strategies including large file handling

Best Practices

  • Always include tests - planemo lint flags tools that define no tests, and planemo test has nothing to run without them

  • Use semantic versioning - Increment tool version on changes

  • Specify exact dependencies - Pin conda package versions

  • Add clear help text - Document all parameters

  • Handle errors gracefully - Check exit codes, validate inputs

  • Use collections - For multiple related files

  • Follow IUC standards - If contributing to the Intergalactic Utilities Commission (IUC)

  • Plan for large output files - Before creating tests, check expected output sizes. If an expected output exceeds 1MB, use assertion-based tests (has_size, has_line) instead of full file comparison (see testing.md)
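The size check in the last point can be scripted. The following is a minimal sketch that picks between a full-file comparison and a `has_size` assertion; the `suggest_output_check` name, the byte value used for "1MB", and the 10% `delta` are all assumptions, and the `value` attribute of `has_size` may be spelled `size` in newer Galaxy releases.

```python
import os

SIZE_LIMIT = 1_000_000  # assumption: "1MB" interpreted as 10^6 bytes


def suggest_output_check(path: str, name: str = "output") -> str:
    """Return a test <output> snippet suited to the expected file's size."""
    size = os.path.getsize(path)
    if size <= SIZE_LIMIT:
        # Small enough to commit: compare against the whole expected file.
        return f'<output name="{name}" file="{os.path.basename(path)}"/>'
    delta = size // 10  # hypothetical 10% tolerance
    return (
        f'<output name="{name}">\n'
        f'    <assert_contents>\n'
        f'        <has_size value="{size}" delta="{delta}"/>\n'
        f'    </assert_contents>\n'
        f'</output>'
    )
```

Calling `suggest_output_check("test-data/expected_output.txt")` (a hypothetical path) returns a snippet that can be pasted into the tests section and adjusted by hand.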

Common Planemo Commands

# Test tool locally
planemo test tool.xml

# Serve tool in local Galaxy
planemo serve tool.xml

# Lint tool for best practices
planemo lint tool.xml

# Upload tool to ToolShed
planemo shed_update --shed_target toolshed

# Test with conda
planemo test --conda_auto_init --conda_auto_install tool.xml

Output Routing with Symlinks

When a tool writes output to a filename it constructs internally (not $output), use symlinks in the command block to route the file to Galaxy's output variable.

Pattern: Symlink before command execution

<command detect_errors="exit_code"><![CDATA[
    ## Create symlink so tool output lands where Galaxy expects it
    ln -s '$output_variable' 'expected_tool_output_name' &&
    tool_command --input '$input' -o 'expected_tool_output_name'
]]></command>
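The mechanism behind this pattern can be reproduced outside Galaxy. The sketch below uses a temporary directory, a placeholder file standing in for `$output_variable`, and a plain file write standing in for the tool; all paths and the `write_through_symlink` name are hypothetical, and it assumes a POSIX-like system where symlinks are available.

```python
import os
import tempfile


def write_through_symlink(workdir: str) -> str:
    """Write to a symlink and return what ended up in the link's target."""
    galaxy_output = os.path.join(workdir, "dataset_1.dat")  # stands in for $output_variable
    link_name = os.path.join(workdir, "expected_tool_output_name")

    open(galaxy_output, "w").close()      # empty placeholder for Galaxy's output path
    os.symlink(galaxy_output, link_name)  # ln -s '$output_variable' 'expected_tool_output_name'

    with open(link_name, "w") as fh:      # the "tool" writes to its own filename
        fh.write("result\n")

    with open(galaxy_output) as fh:       # the data arrived at the output path
        return fh.read()


with tempfile.TemporaryDirectory() as tmp:
    print(write_through_symlink(tmp), end="")  # result
```

This holds when the tool opens the path and writes to it. A tool that writes a temporary file and renames it over the target name would replace the link itself, so the data would not reach the Galaxy output.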

Pattern: Prefix-based output naming

Some tools use --out-prefix where the output filename is prefix + input_filename. The tool constructs the filename internally, so you must predict it and symlink:

<command><![CDATA[
    #import re
    ## Replace every character that is not a word character, hyphen, or whitespace, then append the datatype extension
    #set $mangled_input = re.sub(r"[^\w\-\s]", "_", str($input.element_identifier)) + "." + str($input.ext)
    ln -s '$input' '$mangled_input' &&
    ln -s '$output_var' 'myprefix${mangled_input}' &&
    tool_command --input-reads '$mangled_input' -p myprefix
]]></command>

Key points:

  • Symlink is created before running the tool -- the tool writes through it

  • Must match the exact filename the tool will produce

  • For prefix mode: output = prefix + getFileName(input), so mangle the input name to match
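The name prediction in the prefix pattern can be tried in plain Python before it goes into a Cheetah `#set`. This is a sketch with a hypothetical helper name and made-up inputs; the hyphen in the character class is escaped because Python's `re` module rejects `\w-\s` as a bad character range.

```python
import re


def predicted_output_name(element_identifier: str, ext: str, prefix: str) -> str:
    """Predict prefix + mangled input name, mirroring the #set expression."""
    # Anything that is not a word character, hyphen, or whitespace becomes "_".
    mangled = re.sub(r"[^\w\-\s]", "_", element_identifier) + "." + ext
    return prefix + mangled


print(predicted_output_name("sample#1.fq", "fastqsanger", "myprefix"))
# myprefixsample_1_fq.fastqsanger
```

Whether the real tool derives its output name in exactly this way has to be confirmed against the tool itself; the symlink only works if the prediction matches character for character.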

Using format_source for dynamic output formats

When output format should match the input format (e.g., subsampled reads):

<data name="subsampled_outfile" format_source="input_reads" label="Subsampled reads">
    <filter>output_options["output_type"]["type_selector"] == "subsampled_reads"</filter>
</data>

This is preferable to change_format when the output is always the same format as input. Use change_format when the user explicitly selects the output format.
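The filter in the example above is a Python expression over the tool's parameter values. The sketch below evaluates the same expression against a nested dictionary to show when the output would be kept; the `output_is_created` helper and the parameter dictionaries are hypothetical, and Galaxy's actual evaluation context is richer than a plain dict.

```python
FILTER = 'output_options["output_type"]["type_selector"] == "subsampled_reads"'


def output_is_created(filter_expr: str, params: dict) -> bool:
    """Evaluate a filter expression against parameter values (illustration only)."""
    # eval is used because the filter is itself a Python expression;
    # builtins are emptied so only the supplied parameters are visible.
    return bool(eval(filter_expr, {"__builtins__": {}}, params))


selected = {"output_options": {"output_type": {"type_selector": "subsampled_reads"}}}
other = {"output_options": {"output_type": {"type_selector": "stats"}}}

print(output_is_created(FILTER, selected))  # True
print(output_is_created(FILTER, other))     # False
```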

XML Template Example

<tool id="tool_id" name="Tool Name" version="1.0.0">
    <description>Brief description</description>

    <requirements>
        <requirement type="package" version="1.0">package_name</requirement>
    </requirements>

    <command detect_errors="exit_code"><![CDATA[
        tool_command
            --input '$input'
            --output '$output'
            #if $optional_param
                --param '$optional_param'
            #end if
    ]]></command>

    <inputs>
        <param name="input" type="data" format="txt" label="Input file"/>
        <param name="optional_param" type="text" optional="true" label="Optional parameter"/>
    </inputs>

    <outputs>
        <data name="output" format="txt" label="${tool.name} on ${on_string}"/>
    </outputs>

    <tests>
        <test>
            <param name="input" value="test_input.txt"/>
            <output name="output" file="expected_output.txt"/>
        </test>
    </tests>

    <help><![CDATA[
What it does

Describe what the tool does.

Inputs

- Input file: description

Outputs

- Output file: description
    ]]></help>

    <citations>
        <citation type="doi">10.1234/example.doi</citation>
    </citations>
</tool>
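A common slip when adapting a template like this is a test that sets a parameter the inputs section never declares. The sketch below compares the two sets of names; the `undeclared_test_params` function and the example wrapper (with a deliberate typo) are hypothetical, and the check ignores the `outer|inner` naming that tests may use for parameters nested in conditionals or sections.

```python
import xml.etree.ElementTree as ET
from typing import List


def undeclared_test_params(tool_xml: str) -> List[str]:
    """Return test <param> names that no <param> under <inputs> declares."""
    root = ET.fromstring(tool_xml)
    declared = {p.get("name") for p in root.iterfind("inputs//param")}
    used = {p.get("name") for p in root.iterfind("tests/test//param")}
    return sorted(used - declared)


# Hypothetical wrapper fragment; "optinal_param" is misspelled on purpose.
example = """<tool id="t" name="T" version="1.0.0">
  <inputs>
    <param name="input" type="data" format="txt"/>
    <param name="optional_param" type="text" optional="true"/>
  </inputs>
  <tests>
    <test>
      <param name="input" value="test_input.txt"/>
      <param name="optinal_param" value="x"/>
    </test>
  </tests>
</tool>"""

print(undeclared_test_params(example))  # ['optinal_param']
```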

Supporting Documentation

This skill includes detailed reference documentation:

reference.md - Comprehensive Galaxy tool wrapping guide with IUC best practices

  • Repository structure standards

  • .shed.yml configuration

  • Complete XML structure reference

  • Advanced features and patterns

testing.md - Testing strategies and assertion patterns

  • Regenerating expected test outputs

  • Handling large test files (>1MB CI limit)

  • Size, checksum, and content sampling assertions

  • Workflow for replacing large test files

troubleshooting.md - Practical troubleshooting guide

  • Reading tool_test_output.json

  • Common exit codes and their meanings

  • Common XML and runtime issues

  • Debugging tool test failures

  • Test failure diagnosis and fixes

dependency-debugging.md - Dependency conflict resolution

  • Using planemo mull for diagnosis

  • Conda solver error interpretation

  • macOS testing considerations

  • Version conflict workflows

These files provide deep technical details that complement the core concepts above.

Related Skills

  • galaxy-automation - BioBlend & Planemo foundation (dependency)

  • galaxy-workflow-development - Building workflows that use these tools

  • conda-recipe - Creating conda packages for tool dependencies

  • bioinformatics-fundamentals - Understanding file formats and data types used in tools
