Multi-Agent Architecture Patterns
Multi-agent architectures distribute work across multiple LLM instances, each with its own context window. The critical insight: sub-agents exist primarily to isolate context, not to mirror human role divisions.
Why Multi-Agent?
Context Bottleneck: Single agents fill their context with history, documents, and tool outputs. Performance degrades via the lost-in-the-middle effect and attention scarcity.
Token Economics:
| Architecture | Token Multiplier |
|---|---|
| Single-agent chat | 1× baseline |
| Single agent + tools | ~4× baseline |
| Multi-agent system | ~15× baseline |
Parallelization: Research tasks can search multiple sources simultaneously. Total time approaches the longest subtask, not the sum of all subtasks.
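A minimal sketch of this fan-out with `asyncio` (the `search` coroutine is a hypothetical stand-in for a sub-agent's LLM or tool call):

```python
import asyncio

async def search(source: str) -> str:
    # Hypothetical sub-agent task; the sleep stands in for an LLM/tool call.
    await asyncio.sleep(0.1)
    return f"results from {source}"

async def research(sources):
    # Fan out one sub-agent per source; wall-clock time ≈ slowest subtask,
    # not the sum, because all searches run concurrently.
    return await asyncio.gather(*(search(s) for s in sources))

results = asyncio.run(research(["web", "docs", "arxiv"]))
```

`asyncio.gather` preserves input order, so results can be zipped back to their sources after the fan-out.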
Architectural Patterns
Pattern 1: Supervisor/Orchestrator
User Query -> Supervisor -> [Specialist, Specialist] -> Aggregation -> Output
Use when: Clear decomposition, coordination needed, human oversight important.
The Telephone Game Problem: Supervisors paraphrase sub-agent responses incorrectly.
Fix: forward_message tool lets sub-agents respond directly:
```python
def forward_message(message: str, to_user: bool = True):
    """Forward a sub-agent response directly to the user."""
    if to_user:
        return {"type": "direct_response", "content": message}
```
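One way the supervisor side might honor that payload (a sketch; `deliver` and `summarize` are hypothetical names, not part of any framework):

```python
def summarize(payload: dict) -> str:
    # Placeholder for the supervisor's usual paraphrasing step.
    return f"Supervisor summary of: {payload}"

def deliver(sub_agent_output: dict) -> str:
    # Forwarded payloads bypass the supervisor's retelling entirely,
    # which is what prevents the telephone-game distortion.
    if sub_agent_output.get("type") == "direct_response":
        return sub_agent_output["content"]  # verbatim, no paraphrase
    return summarize(sub_agent_output)      # normal supervisor path
```

The design choice: the sub-agent opts into verbatim delivery per message, so the supervisor still paraphrases routine tool results.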
Pattern 2: Peer-to-Peer/Swarm
```python
def transfer_to_agent_b():
    # Handoff: returning another agent transfers control to it
    return agent_b

agent_a = Agent(name="Agent A", functions=[transfer_to_agent_b])
```
Use when: Flexible exploration, rigid planning counterproductive, emergent requirements.
Pattern 3: Hierarchical
Strategy Layer -> Planning Layer -> Execution Layer
Use when: Large-scale projects, enterprise workflows, clear separation of concerns.
Context Isolation
Primary purpose of multi-agent: context isolation.
Mechanisms:
- Full context delegation: complex tasks needing full understanding
- Instruction passing: simple, well-defined subtasks
- File system memory: shared state without context bloat
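A minimal sketch of file-system memory: agents exchange references to files rather than payloads, so large intermediate results never enter another agent's context (`write_note`/`read_note` are hypothetical helpers):

```python
import json
import tempfile
from pathlib import Path

# Shared scratchpad directory for the agent team.
workdir = Path(tempfile.mkdtemp())

def write_note(agent: str, key: str, value) -> Path:
    # Persist an agent's intermediate result and return only a reference.
    path = workdir / f"{agent}_{key}.json"
    path.write_text(json.dumps(value))
    return path

def read_note(path: Path):
    # Another agent loads the state on demand, keeping its prompt small.
    return json.loads(path.read_text())

ref = write_note("researcher", "findings", {"sources": 12})
# The supervisor passes `ref` (a path) between agents, not the payload itself.
```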
Consensus and Coordination
Weighted Voting: Weight by confidence or expertise.
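Weighted voting reduces to summing weights per candidate answer. A sketch (weights here are illustrative confidence scores):

```python
from collections import defaultdict

def weighted_vote(ballots):
    # ballots: list of (answer, weight) pairs, where weight encodes
    # the voting agent's confidence or domain expertise.
    scores = defaultdict(float)
    for answer, weight in ballots:
        scores[answer] += weight
    return max(scores, key=scores.get)

winner = weighted_vote([("A", 0.9), ("B", 0.6), ("B", 0.5)])
# "B" wins with total weight 1.1, beating "A"'s single 0.9 vote.
```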
Debate Protocols: Agents critique each other's outputs. Adversarial critique often yields higher accuracy than collaborative consensus.
Trigger-Based Intervention:
- Stall triggers: no progress detected
- Sycophancy triggers: agents mimicking each other without independent reasoning
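A stall trigger can be as simple as checking whether recent turns produced any new state. A sketch (the repeated-output heuristic is an assumption; real systems might compare task state instead):

```python
def stalled(history, window: int = 3) -> bool:
    # Fire the trigger when the last `window` turns are identical,
    # i.e. the agent loop is repeating itself without progress.
    recent = history[-window:]
    return len(recent) == window and len(set(recent)) == 1

stalled(["plan", "x", "x", "x"])  # True: three identical turns in a row
```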
Failure Modes
| Failure | Mitigation |
|---|---|
| Supervisor bottleneck | Output schema constraints, checkpointing |
| Coordination overhead | Clear handoff protocols, batched results |
| Divergence | Objective boundaries, convergence checks |
| Error propagation | Output validation, retry with circuit breakers |
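The last row combines validation, retries, and a circuit breaker. A minimal sketch (class and thresholds are illustrative, not a specific library):

```python
class CircuitBreaker:
    """Validate agent output; retry a few times, then open the circuit
    so a persistently failing agent stops polluting downstream agents."""

    def __init__(self, max_retries: int = 2, failures_to_open: int = 3):
        self.max_retries = max_retries
        self.failures_to_open = failures_to_open
        self.failures = 0  # consecutive failed call attempts

    def call(self, agent_fn, validate):
        if self.failures >= self.failures_to_open:
            raise RuntimeError("circuit open: agent disabled")
        for _ in range(self.max_retries + 1):
            output = agent_fn()
            if validate(output):
                self.failures = 0  # success resets the breaker
                return output
        self.failures += 1
        raise ValueError("invalid output after retries")
```

The key property: a bad output raises at the boundary instead of being handed to the next agent, so errors stop propagating.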
Example: Research Team
Supervisor
├── Researcher (web search, document retrieval)
├── Analyzer (data analysis, statistics)
├── Fact-checker (verification, validation)
└── Writer (report generation)
Best Practices
- Design for context isolation as the primary benefit
- Choose a pattern based on coordination needs, not an org metaphor
- Implement explicit handoff protocols with state passing
- Use weighted voting or debate for consensus
- Monitor for supervisor bottlenecks
- Validate outputs before passing them between agents
- Set time-to-live limits to prevent infinite loops
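The last practice, time-to-live limits, can be sketched as a loop guard with both a step budget and a wall-clock budget (names and budgets are illustrative):

```python
import time

def run_with_ttl(step_fn, max_steps: int = 10, max_seconds: float = 30.0):
    # Guard an agent loop: stop when either budget is exhausted,
    # so a confused agent cannot spin forever.
    start = time.monotonic()
    for step in range(max_steps):
        if time.monotonic() - start > max_seconds:
            return {"status": "timeout", "steps": step}
        result = step_fn(step)
        if result is not None:  # agent signalled completion
            return {"status": "done", "steps": step + 1, "result": result}
    return {"status": "step_limit", "steps": max_steps}

run_with_ttl(lambda s: "answer" if s == 2 else None)
# → {"status": "done", "steps": 3, "result": "answer"}
```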