utility

Score candidate agent actions by expected gain, cost, uncertainty, and redundancy to guide dispatch and termination decisions

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "utility" with this command: npx skills add athola/nm-leyline-utility

Night Market Skill — ported from claude-night-market/leyline. For the full experience with agents, hooks, and commands, install the Claude Code plugin.

Utility Skill

Overview

A decision framework for agent orchestration based on Liu et al., "Utility-Guided Agent Orchestration for Efficient LLM Tool Use" (arXiv:2603.19896). Each candidate action is scored by subtracting weighted costs from expected gain, producing a single utility value that guides action selection. The framework prevents over-calling tools and premature stopping by making both errors costly. Utility range is [-2.3, 1.0].

When To Use

  • Deciding whether to dispatch another agent or tool call
  • Gating expensive tool calls (search, code execution, delegation)
  • Selecting the right model tier for a sub-task
  • Continuation decisions after receiving partial results
  • Verification gating before writing or committing output

When NOT to Use

  • Single-step operations with one obvious action
  • Trivial tasks where cost of scoring exceeds benefit
  • Already-committed actions that cannot be undone

Action Space

A = {respond, retrieve, tool_call, verify, delegate, stop}

| Action | Description |
| --- | --- |
| respond | Emit a final answer from current context |
| retrieve | Fetch additional information (search, read, lookup) |
| tool_call | Execute a tool (code runner, API, file write) |
| verify | Check a prior result for correctness or completeness |
| delegate | Spawn a sub-agent or hand off to a specialist |
| stop | Terminate the loop and return current state |
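
One way to represent the action space in code is an enumeration, so that a typo in an action name fails loudly instead of silently scoring a nonexistent action. This is only a sketch: the class name `Action` and the helper list `NON_STOP_ACTIONS` are illustrative names, not part of the skill.

```python
from enum import Enum


class Action(str, Enum):
    """The six candidate actions in A (hypothetical representation)."""

    RESPOND = "respond"      # emit a final answer from current context
    RETRIEVE = "retrieve"    # fetch additional information
    TOOL_CALL = "tool_call"  # execute a tool
    VERIFY = "verify"        # check a prior result
    DELEGATE = "delegate"    # spawn a sub-agent or hand off
    STOP = "stop"            # terminate the loop


# Every action except stop; convenient when checking the utility floor.
NON_STOP_ACTIONS = [a for a in Action if a is not Action.STOP]
```

Looking an action up by its string value, for example `Action("tool_call")`, raises `ValueError` for any name outside A.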

Utility Function

U(a | s_t) = Gain(a | s_t)
           - λ₁ · StepCost(a | s_t)
           - λ₂ · Uncertainty(a | s_t)
           - λ₃ · Redundancy(a | s_t)

| Parameter | Default | Rationale |
| --- | --- | --- |
| λ₁ | 1.0 | Cost baseline; all other weights relative to this |
| λ₂ | 0.5 | Weak empirical correlation with outcome (r=0.0131) |
| λ₃ | 0.8 | Redundancy pruning yields ~10% token savings |

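Utility range: [-2.3, 1.0] with the default weights, assuming Gain and each penalty term are normalized to [0, 1]: the maximum is 1.0 - 0 = 1.0 and the minimum is 0 - (1.0 + 0.5 + 0.8) = -2.3. Positive values indicate the action is worth taking. Values below the floor (default: -0.5) indicate the action should be skipped.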
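
The formula can be sketched in a few lines of Python. The names `Weights` and `utility` are hypothetical, and the input check rests on an assumption the stated range implies but the text does not spell out: that Gain, StepCost, Uncertainty, and Redundancy are each normalized to [0, 1].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Weights:
    """Default weights from the parameter table."""

    step_cost: float = 1.0    # λ1
    uncertainty: float = 0.5  # λ2
    redundancy: float = 0.8   # λ3


def utility(gain, step_cost, uncertainty, redundancy, w=Weights()):
    """Return Gain - λ1·StepCost - λ2·Uncertainty - λ3·Redundancy.

    Assumption: every input is already normalized to [0, 1].
    """
    terms = {
        "gain": gain,
        "step_cost": step_cost,
        "uncertainty": uncertainty,
        "redundancy": redundancy,
    }
    for name, value in terms.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return (
        gain
        - w.step_cost * step_cost
        - w.uncertainty * uncertainty
        - w.redundancy * redundancy
    )
```

With the defaults, `utility(1, 0, 0, 0)` gives the upper bound 1.0 and `utility(0, 1, 1, 1)` gives the lower bound of about -2.3. A mid-range call such as `utility(0.8, 0.2, 0.4, 0.25)` comes out at roughly 0.2, since each penalty contributes 0.2.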

Termination Conditions

Stop the loop when any of the following is true:

  • (a) Selected action is stop
  • (b) Step budget exhausted (default: 10 steps)
  • (c) All non-stop actions score below the floor (default: -0.5)

High-gain override: If Gain >= 0.7 for any action, condition (c) may be overridden. Document the override and the gain value in your reasoning trace.
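
The three conditions and the override could be checked as sketched below. The function name `should_stop` and its signature are hypothetical. Two assumptions are baked in: `step` counts steps already taken, so the budget is exhausted once `step >= budget`, and the high-gain override is always applied when available, although the text only says it may be.

```python
FLOOR = -0.5      # default utility floor
STEP_BUDGET = 10  # default step budget
HIGH_GAIN = 0.7   # gain threshold for overriding condition (c)


def should_stop(selected, step, utilities, gains,
                floor=FLOOR, budget=STEP_BUDGET, high_gain=HIGH_GAIN):
    """Return (stop, reason) for the current step.

    utilities and gains map action names to floats.
    """
    if selected == "stop":
        return True, "(a) selected action is stop"
    if step >= budget:
        return True, "(b) step budget exhausted"

    non_stop = [u for a, u in utilities.items() if a != "stop"]
    if non_stop and all(u < floor for u in non_stop):
        # Assumption: take the override whenever any action qualifies.
        overriding = {a: g for a, g in gains.items() if g >= high_gain}
        if overriding:
            return False, f"(c) overridden, high gain: {overriding}"
        return True, "(c) all non-stop actions below floor"

    return False, "continue"
```

The returned reason string carries the gain value when an override fires, which is one way to satisfy the requirement to document it in the reasoning trace.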

Quick Start

Minimal 4-step advisory pattern:

  1. Construct state -- gather task context per modules/state-builder.md
  2. Score candidates -- evaluate each action in A per modules/action-selector.md
  3. Prefer highest utility -- select the action with the maximum U(a | s_t), subject to termination conditions
  4. Log score and decision -- record the winning action, its utility value, and step count before executing
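
Steps 2 to 4 above might look like the following sketch. Everything named here is hypothetical: `select_action`, the logger name, and the shape of `estimates` (a dict mapping each action to a hand-supplied tuple of gain, step cost, uncertainty, and redundancy). State construction and the real estimators live in the referenced modules and are not reproduced.

```python
import logging

logger = logging.getLogger("utility")

# Default weights (λ1, λ2, λ3) from the Utility Function section.
WEIGHTS = (1.0, 0.5, 0.8)


def score(estimate):
    """Compute U for one (gain, step_cost, uncertainty, redundancy) tuple."""
    gain, step_cost, uncertainty, redundancy = estimate
    l1, l2, l3 = WEIGHTS
    return gain - l1 * step_cost - l2 * uncertainty - l3 * redundancy


def select_action(estimates, step):
    """Score every candidate, pick the maximum, and log the decision."""
    scores = {action: score(est) for action, est in estimates.items()}
    # max() keeps the first maximal key; the skill's own tie-breaking
    # rules are defined in modules/action-selector.md, not here.
    best = max(scores, key=scores.get)
    logger.info("step=%d action=%s utility=%.3f", step, best, scores[best])
    return best, scores


# Illustrative, made-up estimates for three of the six actions.
estimates = {
    "respond": (0.5, 0.1, 0.4, 0.0),
    "retrieve": (0.8, 0.2, 0.2, 0.1),
    "stop": (0.0, 0.0, 0.0, 0.0),
}
best, scores = select_action(estimates, step=1)
```

Here `retrieve` wins with a utility of about 0.42, ahead of `respond` at about 0.2. A full loop would score all six actions and check the termination conditions before executing the winner.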

Detailed Resources

  • State Builder: modules/state-builder.md -- how to populate s_t from task context
  • Gain: modules/gain.md -- estimating expected information or progress gain
  • Step Cost: modules/step-cost.md -- token, latency, and monetary cost tables
  • Uncertainty: modules/uncertainty.md -- confidence estimation and calibration
  • Redundancy: modules/redundancy.md -- detecting duplicate or low-delta actions
  • Action Selector: modules/action-selector.md -- scoring loop and tie-breaking rules
  • Integration: modules/integration.md -- wiring utility scoring into existing orchestration loops

Exit Criteria

  • State constructed with task goal and prior steps
  • All six actions scored before selecting one
  • Termination condition checked after each step
  • Score and decision logged for each step taken
  • High-gain overrides documented with gain value

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
