Multi-Agent Patterns Skill
Overview
This skill addresses multi-agent system design, covering scenarios where supervisor patterns, swarm architectures, or agent coordination strategies are needed. Core insight: "Sub-agents exist primarily to isolate context, not to anthropomorphize role division."
Quick Start
- Identify need - Why multiple agents? (context limits, parallelism, specialization)
- Choose pattern - Supervisor, peer-to-peer, or hierarchical
- Design communication - Message passing, handoffs, state sharing
- Implement safeguards - Validation, timeouts, conflict resolution
- Monitor - Token usage, bottlenecks, failures
When to Use
- Context window limits prevent single-agent solutions
- Tasks benefit from parallel execution
- Different domains require specialized knowledge
- Complex workflows need coordination
- Resilience through redundancy is required
Three Primary Patterns
- Supervisor/Orchestrator
Structure: Central coordinator delegates to specialists and synthesizes results.
            [Supervisor]
            /    |    \
    [Agent A] [Agent B] [Agent C]
        ↑         ↑         ↑
        └─────────┴─────────┘
          Results flow up
Best for:
- Tasks with clear decomposition
- Human oversight needs
- Sequential dependencies
- Quality control requirements
Key consideration: The "telephone game problem" emerges when supervisors paraphrase sub-agent responses incorrectly.
Solution: Implement a forward_message tool that lets a sub-agent's response reach the user directly, without being rewritten by the supervisor:
    def forward_message(agent_id: str, message: str, to: str = "user") -> dict:
        """Forward agent message directly without supervisor interpretation."""
        # Include the recipient so the `to` argument is not silently dropped.
        return {"from": agent_id, "to": to, "message": message, "forwarded": True}
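As a rough illustration of how a supervisor might use such a tool, the sketch below forwards selected sub-agent outputs verbatim. The helper `supervisor_reply`, the `verbatim` parameter, and the sample agent IDs are hypothetical and not part of the original skill.

```python
from typing import Dict, List


def forward_message(agent_id: str, message: str, to: str = "user") -> Dict:
    """Forward agent message directly without supervisor interpretation."""
    return {"from": agent_id, "to": to, "message": message, "forwarded": True}


def supervisor_reply(results: Dict[str, str], verbatim: List[str]) -> List[Dict]:
    """Hypothetical helper: pass selected sub-agent outputs through untouched.

    Outputs from agents listed in `verbatim` are forwarded as-is. Summarizing
    the remaining outputs is out of scope for this sketch.
    """
    return [
        forward_message(agent_id, text)
        for agent_id, text in results.items()
        if agent_id in verbatim
    ]


results = {
    "legal": "Clause 4.2 caps liability at $10,000.",
    "style": "Tone is fine.",
}
forwarded = supervisor_reply(results, verbatim=["legal"])
```

Forwarding is most useful for outputs where exact wording matters (figures, quotations, legal text), since those are the ones a paraphrase is most likely to distort.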
- Peer-to-Peer/Swarm
Structure: No central control; agents communicate directly through protocols.
    [Agent A] ←→ [Agent B]
       ↑↓           ↑↓
    [Agent C] ←→ [Agent D]
Best for:
- Flexible exploration
- Emergent problem-solving
- Parallel processing
- Resilient architectures
Key requirements:
- Predefined communication protocols
- Explicit handoff mechanisms
- Shared state management
- Conflict resolution rules
- Hierarchical
Structure: Layers of agents with strategy, planning, and execution tiers.
        [Strategy Layer]
               ↓
        [Planning Layer]
          /    |    \
    [Exec A] [Exec B] [Exec C]
Best for:
- Complex organizational workflows
- Multi-level abstraction
- Clear separation of concerns
- Enterprise-scale systems
Layer responsibilities:
- Strategy: Goals, priorities, resource allocation
- Planning: Task decomposition, scheduling, coordination
- Execution: Actual work, reporting, feedback
Token Economics
Reality check: Multi-agent systems can consume roughly 15x the tokens of a single-agent approach, and the multiplier grows with the number of agents.
| Approach | Token Multiplier | Use Case |
|---|---|---|
| Single Agent | 1x | Simple, focused tasks |
| 2-3 Agents | 3-5x | Moderate complexity |
| Full Swarm | 10-20x | Complex, parallel work |
Optimization strategies:
- Model selection often provides larger gains than more agents
- Use smaller models for routine tasks
- Reserve large models for synthesis and decisions
- Implement aggressive context compression
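To make the multipliers and the model-selection advice concrete, here is a minimal sketch that estimates a token budget and routes tasks by kind. The multiplier ranges are copied from the table in this section. The function names, the approach keys, the task kinds, and the model labels `"small-model"` and `"large-model"` are placeholders chosen for illustration.

```python
from typing import Tuple

# Multiplier ranges copied from the Token Economics table.
TOKEN_MULTIPLIERS = {
    "single": (1, 1),
    "small_team": (3, 5),    # 2-3 agents
    "full_swarm": (10, 20),
}


def estimate_token_range(baseline_tokens: int, approach: str) -> Tuple[int, int]:
    """Return (low, high) token estimates for a task of known single-agent cost."""
    low, high = TOKEN_MULTIPLIERS[approach]
    return baseline_tokens * low, baseline_tokens * high


def pick_model(task_kind: str) -> str:
    """Route routine work to a small model; keep the large one for synthesis and decisions."""
    # "large-model" and "small-model" are placeholder labels, not real model names.
    return "large-model" if task_kind in {"synthesis", "decision"} else "small-model"


swarm_budget = estimate_token_range(20_000, "full_swarm")
```

Running the estimate before choosing a pattern gives a quick sanity check on whether the expected benefit justifies the extra tokens.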
Communication Patterns
Message Passing
    from dataclasses import dataclass
    from typing import Literal

    @dataclass
    class AgentMessage:
        sender: str
        recipient: str
        content: str
        message_type: Literal["request", "response", "broadcast"]
        requires_ack: bool = False
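The sketch below shows one way the `requires_ack` flag could be honored. `MessageBus` is a hypothetical in-memory stand-in written for illustration. A real deployment would likely use a queue or broker, and broadcast delivery is not handled here.

```python
import itertools
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Deque, Dict, Literal, Optional, Set, Tuple


@dataclass
class AgentMessage:
    sender: str
    recipient: str
    content: str
    message_type: Literal["request", "response", "broadcast"]
    requires_ack: bool = False


class MessageBus:
    """Hypothetical in-memory bus that tracks unacknowledged messages."""

    def __init__(self) -> None:
        self._inboxes: Dict[str, Deque[Tuple[int, AgentMessage]]] = defaultdict(deque)
        self._pending_acks: Set[int] = set()
        self._ids = itertools.count(1)

    def send(self, msg: AgentMessage) -> int:
        """Queue a message for its recipient and return its ID."""
        msg_id = next(self._ids)
        self._inboxes[msg.recipient].append((msg_id, msg))
        if msg.requires_ack:
            self._pending_acks.add(msg_id)
        return msg_id

    def receive(self, agent_id: str) -> Optional[Tuple[int, AgentMessage]]:
        """Pop the oldest message for an agent, or None if the inbox is empty."""
        inbox = self._inboxes[agent_id]
        return inbox.popleft() if inbox else None

    def ack(self, msg_id: int) -> None:
        self._pending_acks.discard(msg_id)

    def unacknowledged(self) -> Set[int]:
        """IDs of messages that required an ack and have not received one."""
        return set(self._pending_acks)


bus = MessageBus()
msg_id = bus.send(
    AgentMessage("supervisor", "researcher", "Summarize doc 12", "request", requires_ack=True)
)
received_id, received = bus.receive("researcher")
pending_before = bus.unacknowledged()
bus.ack(received_id)
pending_after = bus.unacknowledged()
```

Anything still listed by `unacknowledged()` after a deadline is a candidate for retry, which is one way to detect lost messages.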
Handoff Protocol
    from dataclasses import dataclass

    @dataclass
    class Handoff:
        from_agent: str
        to_agent: str
        context: dict  # Compressed relevant state
        task: str
        expected_output: str
        timeout_seconds: int = 300
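A minimal sketch of how a handoff might be built and executed, assuming the receiving agent can be modeled as a plain callable. `build_handoff` and `run_handoff` are hypothetical helpers. "Compression" here is simply key selection, and the timeout only stops waiting for the worker; it does not kill the worker thread.

```python
import concurrent.futures
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable


@dataclass
class Handoff:
    from_agent: str
    to_agent: str
    context: dict  # Compressed relevant state
    task: str
    expected_output: str
    timeout_seconds: int = 300


def build_handoff(state: Dict[str, Any], relevant_keys: Iterable[str], **kwargs: Any) -> Handoff:
    """Copy only the keys the receiving agent needs into the handoff context."""
    context = {k: state[k] for k in relevant_keys if k in state}
    return Handoff(context=context, **kwargs)


def run_handoff(handoff: Handoff, worker: Callable[[Handoff], str]) -> Dict[str, Any]:
    """Run `worker` in a thread and stop waiting after handoff.timeout_seconds."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(worker, handoff)
    try:
        output = future.result(timeout=handoff.timeout_seconds)
        return {"status": "ok", "output": output}
    except concurrent.futures.TimeoutError:
        return {"status": "timeout", "output": None}
    finally:
        # Do not block on a stuck worker; the thread itself is not terminated.
        executor.shutdown(wait=False)


state = {"ticket_id": 42, "customer_tier": "gold", "raw_logs": "x" * 10_000}
handoff = build_handoff(
    state,
    ["ticket_id", "customer_tier"],
    from_agent="triage",
    to_agent="billing",
    task="Issue refund",
    expected_output="Refund confirmation ID",
)
result = run_handoff(handoff, lambda h: f"refund-for-{h.context['ticket_id']}")
```

Leaving `raw_logs` out of the context is the point of the exercise: the receiving agent starts with only what it needs, which keeps its context window small.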
State Sharing
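    from dataclasses import dataclass
    from datetime import datetime
    from typing import Optional

    @dataclass
    class SharedState:
        version: int
        last_updated: datetime
        data: dict
        lock_holder: Optional[str] = None

        def acquire_lock(self, agent_id: str) -> bool: ...  # True if the lock was obtained
        def release_lock(self, agent_id: str) -> bool: ...  # True if agent_id held the lock
        def update(self, agent_id: str, changes: dict) -> bool: ...  # True if changes were applied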
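The method bodies are left open in the interface above. One possible single-process implementation is sketched here. The default field values and the rule that only the lock holder may update are assumptions made for this example, and the class is not thread-safe or distributed.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class SharedState:
    # Defaults are an assumption added for this sketch.
    version: int = 0
    last_updated: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    data: dict = field(default_factory=dict)
    lock_holder: Optional[str] = None

    def acquire_lock(self, agent_id: str) -> bool:
        """Take the lock if it is free or already held by this agent."""
        if self.lock_holder in (None, agent_id):
            self.lock_holder = agent_id
            return True
        return False

    def release_lock(self, agent_id: str) -> bool:
        """Release the lock, but only for the agent that holds it."""
        if self.lock_holder != agent_id:
            return False
        self.lock_holder = None
        return True

    def update(self, agent_id: str, changes: dict) -> bool:
        """Apply changes and bump the version; rejected unless the caller holds the lock."""
        if self.lock_holder != agent_id:
            return False
        self.data.update(changes)
        self.version += 1
        self.last_updated = datetime.now(timezone.utc)
        return True


shared = SharedState()
shared.acquire_lock("planner")
applied = shared.update("planner", {"stage": "drafting"})
rejected = shared.update("writer", {"stage": "done"})
shared.release_lock("planner")
```

The version counter lets readers detect that the state changed between two reads, which is the basis for the consistency checks described under Validation Requirements.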
Implementation Guidance
Validation Requirements
- Validate outputs before inter-agent transfer
- Check message format and completeness
- Verify agent capabilities before assignment
- Validate state consistency after updates
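The format and capability checks above could be sketched as a single pre-send validator. The message is modeled as a plain dict with the same fields as `AgentMessage`. The function `validate_message`, the `capabilities` registry, and the sample capability names are hypothetical.

```python
from typing import Dict, List, Set

REQUIRED_FIELDS = ("sender", "recipient", "content", "message_type")
ALLOWED_TYPES = {"request", "response", "broadcast"}


def validate_message(
    msg: Dict,
    capabilities: Dict[str, Set[str]],
    required_capability: str = "",
) -> List[str]:
    """Return a list of problems; an empty list means the message may be sent."""
    # Completeness: every required field must be present and non-empty.
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not msg.get(name)]
    # Format: the message type must be one of the known values.
    if msg.get("message_type") not in ALLOWED_TYPES:
        problems.append(f"unknown message_type: {msg.get('message_type')!r}")
    # Capability: the recipient must be registered as able to do the work.
    recipient = msg.get("recipient")
    if required_capability and required_capability not in capabilities.get(recipient, set()):
        problems.append(f"{recipient} lacks capability: {required_capability}")
    return problems


capabilities = {"coder": {"python"}}
good = {"sender": "sup", "recipient": "coder", "content": "fix bug", "message_type": "request"}
bad = {"sender": "sup", "recipient": "coder", "content": "", "message_type": "ping"}
good_problems = validate_message(good, capabilities, "python")
bad_problems = validate_message(bad, capabilities, "sql")
```

Returning a list of problems rather than raising on the first one makes it easier to log everything that was wrong with a rejected message.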
Consensus Mechanisms
| Mechanism | Description | Best For |
|---|---|---|
| Simple Majority | >50% agreement | Quick decisions |
| Weighted Voting | Votes weighted by confidence | Quality-sensitive |
| Quorum | Minimum respondents required | Fault tolerance |
| Leader Election | Designated decision maker | Speed |
Recommendation: Implement weighted voting rather than simple majority:
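    from typing import List
    def weighted_consensus(votes: List[Vote]) -> Decision:
        # Vote (with .value and .confidence) and Decision are assumed to be defined elsewhere.
        weighted_sum = sum(v.confidence * v.value for v in votes)
        total_weight = sum(v.confidence for v in votes)
        if total_weight == 0:
            raise ValueError("no votes with non-zero confidence")
        return Decision(value=weighted_sum / total_weight)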
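For a version that runs end to end, the sketch below fills in `Vote` and `Decision` and combines weighted voting with a quorum check. The field names, the `quorum` and `threshold` parameters, and the encoding of votes as 1.0 (approve) and 0.0 (reject) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Vote:
    agent_id: str
    value: float       # assumed encoding: 1.0 = approve, 0.0 = reject
    confidence: float  # assumed range: 0.0 to 1.0


@dataclass
class Decision:
    value: float
    approved: bool


def weighted_consensus(votes: List[Vote], quorum: int = 2, threshold: float = 0.5) -> Decision:
    """Confidence-weighted average of votes, refused if too few agents responded."""
    if len(votes) < quorum:
        raise ValueError(f"quorum not met: {len(votes)} of {quorum} votes")
    total_weight = sum(v.confidence for v in votes)
    if total_weight == 0:
        raise ValueError("all votes have zero confidence")
    score = sum(v.confidence * v.value for v in votes) / total_weight
    return Decision(value=score, approved=score > threshold)


votes = [Vote("a", 1.0, 0.9), Vote("b", 0.0, 0.3), Vote("c", 0.0, 0.4)]
decision = weighted_consensus(votes)
```

In this example one confident approval outweighs two low-confidence rejections (0.9 of a total weight of 1.6), so the result is an approval even though a simple majority would have rejected.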
Safeguards
Execution TTL - Prevent infinite loops:
    max_execution_time = 300  # seconds
    max_iterations = 100
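These two limits could be enforced by a loop like the following sketch. `run_with_ttl`, the `step` callable, and the injectable `clock` are hypothetical names. The time limit is only checked between steps, so a single long-running step is not interrupted.

```python
import time
from typing import Callable, Optional

max_execution_time = 300  # seconds
max_iterations = 100


def run_with_ttl(
    step: Callable[[int], Optional[str]],
    max_seconds: float = max_execution_time,
    max_iters: int = max_iterations,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Call step(i) until it returns a result or one of the limits is hit."""
    deadline = clock() + max_seconds
    for i in range(max_iters):
        # Check the wall-clock budget before starting another step.
        if clock() >= deadline:
            return {"status": "timeout", "iterations": i, "result": None}
        result = step(i)
        if result is not None:
            return {"status": "done", "iterations": i + 1, "result": result}
    return {"status": "iteration_limit", "iterations": max_iters, "result": None}


# Example: a step that finishes on its third call.
outcome = run_with_ttl(lambda i: "ok" if i == 2 else None)
```

Passing the clock as a parameter keeps the loop testable without real waiting.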
Checkpoint Monitoring - Detect supervisor bottlenecks:
    checkpoint_interval = 30  # seconds
    alert_threshold = 3  # missed checkpoints
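One plausible reading of these settings is that an alert fires once three consecutive checkpoints have been missed. The sketch below encodes that reading. The function names and the use of plain timestamps in seconds are assumptions.

```python
checkpoint_interval = 30  # seconds
alert_threshold = 3       # missed checkpoints


def missed_checkpoints(
    last_checkpoint: float,
    now: float,
    interval: float = checkpoint_interval,
) -> int:
    """Number of whole checkpoint intervals elapsed since the last report."""
    return max(0, int((now - last_checkpoint) // interval))


def should_alert(
    last_checkpoint: float,
    now: float,
    interval: float = checkpoint_interval,
    threshold: int = alert_threshold,
) -> bool:
    """True once the agent has missed at least `threshold` checkpoints."""
    return missed_checkpoints(last_checkpoint, now, interval) >= threshold


# A supervisor that last reported at t=100 and is checked at t=195.
stalled = should_alert(last_checkpoint=100.0, now=195.0)
```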
Circuit Breaker - Handle cascading failures:
    failure_threshold = 3
    recovery_timeout = 60  # seconds
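A minimal circuit breaker built around these two parameters might look like the sketch below. The class and method names are illustrative, and the half-open behavior (allowing trial calls again after the recovery timeout) is a common convention assumed here rather than something the skill specifies.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Stops calls to a failing agent until a recovery timeout has passed."""

    def __init__(
        self,
        failure_threshold: int = 3,
        recovery_timeout: float = 60,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means the circuit is closed

    def allow(self) -> bool:
        """True if a call may be attempted right now."""
        if self._opened_at is None:
            return True
        # Once the recovery timeout has elapsed, trial calls are allowed again.
        return self._clock() - self._opened_at >= self.recovery_timeout

    def record_success(self) -> None:
        """Close the circuit and reset the failure count."""
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        """Count a failure and open the circuit once the threshold is reached."""
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()


breaker = CircuitBreaker()
for _ in range(3):
    breaker.record_failure()
blocked = not breaker.allow()
```

Callers check `allow()` before dispatching to an agent and report the outcome afterwards, so a failing agent stops receiving work instead of dragging down the agents that depend on it.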
Best Practices
Do
- Start with simplest pattern that works
- Define explicit handoff protocols
- Include state management from the start
- Monitor token usage per agent
- Implement graceful degradation
- Log all inter-agent communication
Don't
- Use multi-agent for single-agent problems
- Assume agents will coordinate implicitly
- Ignore token costs during design
- Skip validation between agents
- Create deeply nested hierarchies
- Forget timeout handling
Error Handling
| Error | Cause | Solution |
|---|---|---|
| Agent timeout | Task too complex | Break into subtasks, extend timeout |
| Conflicting outputs | Ambiguous task | Clarify requirements, add validation |
| Lost messages | Network/state issues | Implement acknowledgments, retry |
| Infinite loop | Missing termination | Add TTL, iteration limits |
| Supervisor bottleneck | Too many reports | Add intermediate aggregators |
Metrics
| Metric | Target | Description |
|---|---|---|
| Task completion rate | >95% | Successfully completed tasks |
| Token efficiency | >0.5 | Output value / tokens used |
| Coordination overhead | <30% | Tokens for coordination vs. work |
| Agent utilization | >70% | Active time vs. waiting |
| Error rate | <5% | Failed inter-agent operations |
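As a sketch of how most of these metrics could be derived from raw counters, consider the following. The `RunStats` fields are hypothetical. Coordination overhead is interpreted here as coordination tokens divided by total tokens, which is an assumption. Token efficiency is omitted because "output value" has no definition in this document. The function assumes all totals are non-zero.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class RunStats:
    tasks_total: int
    tasks_completed: int
    coordination_tokens: int
    work_tokens: int
    active_seconds: float
    waiting_seconds: float
    operations_total: int
    operations_failed: int


def compute_metrics(s: RunStats) -> Dict[str, float]:
    """Derive ratio metrics from raw counters (assumes non-zero totals)."""
    total_tokens = s.coordination_tokens + s.work_tokens
    total_time = s.active_seconds + s.waiting_seconds
    return {
        "task_completion_rate": s.tasks_completed / s.tasks_total,
        # Assumed definition: share of all tokens spent on coordination.
        "coordination_overhead": s.coordination_tokens / total_tokens,
        "agent_utilization": s.active_seconds / total_time,
        "error_rate": s.operations_failed / s.operations_total,
    }


stats = RunStats(
    tasks_total=20,
    tasks_completed=19,
    coordination_tokens=2_500,
    work_tokens=7_500,
    active_seconds=80.0,
    waiting_seconds=20.0,
    operations_total=200,
    operations_failed=4,
)
metrics = compute_metrics(stats)
```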
Pattern Selection Guide
    Is context window sufficient?
    ├── Yes → Single agent
    └── No → Are tasks parallelizable?
        ├── Yes → Can agents work independently?
        │   ├── Yes → Peer-to-peer
        │   └── No → Supervisor with parallel workers
        └── No → Is there clear hierarchy?
            ├── Yes → Hierarchical
            └── No → Supervisor/Orchestrator
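The same decision tree can be written as a small function, which is convenient for tests or design checklists. The function name and parameter names are illustrative. The returned strings match the leaves of the tree.

```python
def choose_pattern(
    context_sufficient: bool,
    parallelizable: bool = False,
    independent: bool = False,
    clear_hierarchy: bool = False,
) -> str:
    """Walk the pattern selection tree and return the recommended pattern."""
    if context_sufficient:
        return "Single agent"
    if parallelizable:
        # Parallel work: independence decides between swarm and supervisor.
        return "Peer-to-peer" if independent else "Supervisor with parallel workers"
    # Sequential work: a clear hierarchy favors layered agents.
    return "Hierarchical" if clear_hierarchy else "Supervisor/Orchestrator"


recommendation = choose_pattern(context_sufficient=False, parallelizable=True, independent=False)
```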
Related Skills
- memory-systems - Cross-session persistence
- parallel-dispatch - Concurrent agent execution
- subagent-driven - Task execution pattern
Version History
- 1.0.0 (2026-01-19): Initial release adapted from Agent-Skills-for-Context-Engineering