agent-review

Review AI agent implementations for best practices in architecture, folder structure, design patterns, error handling, and observability. Use when auditing agent codebases or designing new agent systems.


Agent Implementation Review

Review AI agent implementations for architectural best practices.

Target: $ARGUMENTS (path to agent project or codebase)

When to Use This Skill

  • Auditing existing agent implementations
  • Designing new agent architectures
  • Reviewing agent code for production readiness
  • Evaluating multi-agent system designs
  • Assessing agent reliability and observability

Review Process

  1. Discover - Explore folder structure at $ARGUMENTS
  2. Analyze - Check against architecture patterns
  3. Evaluate - Score each category
  4. Report - Generate findings with recommendations

Folder Structure Best Practices

Recommended Agent Project Structure

agent-project/
├── src/
│   ├── agents/              # Agent definitions
│   │   ├── base.py          # Base agent class
│   │   ├── planner.py       # Planning agent
│   │   └── executor.py      # Execution agent
│   ├── tools/               # Tool implementations
│   │   ├── __init__.py
│   │   ├── base.py          # Tool base class/interface
│   │   ├── search.py        # Search tool
│   │   └── code.py          # Code execution tool
│   ├── memory/              # Memory/state management
│   │   ├── short_term.py    # Conversation context
│   │   ├── long_term.py     # Persistent storage
│   │   └── vector_store.py  # Embeddings/RAG
│   ├── prompts/             # Prompt templates
│   │   ├── system.py        # System prompts
│   │   └── templates/       # Jinja/string templates
│   ├── orchestration/       # Multi-agent coordination
│   │   ├── router.py        # Request routing
│   │   └── workflow.py      # Agent workflows
│   ├── models/              # Data models/schemas
│   │   ├── messages.py      # Message types
│   │   └── state.py         # State schemas
│   └── utils/               # Shared utilities
│       ├── logging.py       # Structured logging
│       └── retry.py         # Retry logic
├── config/                  # Configuration
│   ├── default.yaml         # Default settings
│   └── prompts/             # External prompt files
├── tests/                   # Test suite
│   ├── unit/
│   ├── integration/
│   └── fixtures/            # Test data
└── scripts/                 # CLI/automation

Structure Checklist

| Component | Required | Check |
|-----------|----------|-------|
| Agent definitions separated | Yes | [ ] |
| Tools in dedicated module | Yes | [ ] |
| Prompts externalized | Recommended | [ ] |
| Configuration separated | Yes | [ ] |
| Tests present | Yes | [ ] |
| Clear separation of concerns | Yes | [ ] |

Design Pattern Checklist

1. Tool Design

Required Patterns:

  • Tools have clear input/output schemas
  • Tool errors return structured error responses
  • Tools are stateless (no side effects on agent state)
  • Tool timeouts are configured
  • Tools validate inputs before execution

BAD:

def search(query):
    return requests.get(f"https://api.com?q={query}").json()

GOOD:

from pydantic import BaseModel, Field
from requests.exceptions import Timeout

class SearchTool(BaseTool):
    name = "search"
    description = "Search the web for information"

    class InputSchema(BaseModel):
        query: str = Field(..., min_length=1, max_length=500)

    def execute(self, query: str) -> ToolResult:
        try:
            response = self.client.search(query, timeout=10)
            return ToolResult(success=True, data=response)
        except Timeout:
            return ToolResult(success=False, error="Search timed out")
        except Exception as e:
            return ToolResult(success=False, error=str(e))

2. Agent Loop

Required Patterns:

  • Clear think → act → observe cycle
  • Maximum iteration limit
  • Graceful termination conditions
  • State preserved between iterations
  • Interrupt/cancel capability

GOOD:

class Agent:
    MAX_ITERATIONS = 10

    async def run(self, task: str) -> AgentResult:
        state = AgentState(task=task)

        for i in range(self.MAX_ITERATIONS):
            if self._should_stop(state):
                break

            # Think
            action = await self.plan(state)

            # Act
            result = await self.execute(action)

            # Observe
            state = self.update_state(state, result)

        return self.finalize(state)

3. Memory Management

Required Patterns:

  • Conversation history with size limits
  • Summarization for long conversations
  • Clear memory lifecycle (create, read, update, delete)
  • Persistent storage for long-term memory
  • Vector store for semantic retrieval (if RAG)

Memory Types:

| Type | Purpose | Persistence |
|------|---------|-------------|
| Working | Current task context | Session |
| Short-term | Recent conversation | Session |
| Long-term | User preferences, facts | Persistent |
| Episodic | Past task summaries | Persistent |
| Semantic | Embeddings/RAG | Persistent |
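
The size-limit and summarization requirements above can be sketched as a bounded short-term memory. This is a minimal illustration, not a prescribed API: `Message` and the `_summarize` placeholder are assumptions, and a real implementation would call the model to produce the summary.

```python
from dataclasses import dataclass, field

@dataclass
class Message:
    role: str
    content: str

@dataclass
class ShortTermMemory:
    max_messages: int = 20
    messages: list[Message] = field(default_factory=list)
    summary: str = ""

    def add(self, message: Message) -> None:
        self.messages.append(message)
        if len(self.messages) > self.max_messages:
            # Fold the oldest half into a running summary to bound context size.
            cutoff = self.max_messages // 2
            overflow = self.messages[:cutoff]
            self.messages = self.messages[cutoff:]
            self.summary = self._summarize(self.summary, overflow)

    def _summarize(self, prior: str, overflow: list[Message]) -> str:
        # Placeholder: a real implementation would ask the model to summarize.
        return (prior + " " + " ".join(m.content for m in overflow)).strip()
```

Keeping the summary separate from the raw messages gives the agent a stable prefix for its context window while recent turns stay verbatim.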

4. Error Handling

Required Patterns:

  • Structured error types (not generic exceptions)
  • Retry with exponential backoff for transient errors
  • Graceful degradation (fallback behaviors)
  • Error context preserved for debugging
  • User-friendly error messages

Error Categories:

| Category | Retry | Action |
|----------|-------|--------|
| Rate limit | Yes | Exponential backoff |
| Timeout | Yes | Retry with longer timeout |
| Auth failure | No | Fail with clear message |
| Invalid input | No | Return validation error |
| Tool failure | Maybe | Try alternative tool |
| Model error | Yes | Retry or fallback model |

GOOD:

from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential

class AgentError(Exception):
    def __init__(self, message: str, code: str, recoverable: bool = False):
        super().__init__(message)
        self.message = message
        self.code = code
        self.recoverable = recoverable

@retry(
    retry=retry_if_exception_type(RateLimitError),
    wait=wait_exponential(multiplier=1, max=60),
    stop=stop_after_attempt(3)
)
async def call_model(self, messages: list) -> str:
    try:
        return await self.client.complete(messages)
    except RateLimitError:
        raise  # Let the retry decorator handle it
    except AuthError as e:
        # Chain with `from e` so the original traceback is preserved
        raise AgentError("Authentication failed", "AUTH_ERROR", recoverable=False) from e

5. State Management

Required Patterns:

  • Immutable state updates (new state object per update)
  • State schema validation
  • State serialization for persistence
  • Clear state transitions
  • State versioning for migrations

GOOD:

from dataclasses import dataclass, replace
from typing import Literal

@dataclass(frozen=True)
class AgentState:
    task: str
    messages: tuple[Message, ...] = ()
    tool_results: tuple[ToolResult, ...] = ()
    iteration: int = 0
    status: Literal["running", "completed", "failed"] = "running"

    def with_message(self, message: Message) -> "AgentState":
        return replace(self, messages=self.messages + (message,))

    def with_tool_result(self, result: ToolResult) -> "AgentState":
        return replace(self, tool_results=self.tool_results + (result,))

6. Multi-Agent Coordination

Patterns (if applicable):

  • Clear agent roles and responsibilities
  • Message passing protocol defined
  • Conflict resolution strategy
  • Supervisor/orchestrator pattern
  • Shared state management

Coordination Patterns:

| Pattern | Use Case |
|---------|----------|
| Supervisor | One agent routes to specialists |
| Pipeline | Sequential agent processing |
| Debate | Multiple agents propose, one decides |
| Swarm | Autonomous agents, shared goals |
| Hierarchical | Manager → workers structure |
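
As one concrete illustration, the supervisor pattern can be sketched as a router that delegates to registered specialists. The role names and keyword-based routing rule are toy assumptions; a real supervisor would typically ask a model to classify the task.

```python
from typing import Callable

class Supervisor:
    """Routes incoming tasks to the specialist registered for a role."""

    def __init__(self) -> None:
        self._specialists: dict[str, Callable[[str], str]] = {}

    def register(self, role: str, handler: Callable[[str], str]) -> None:
        self._specialists[role] = handler

    def route(self, task: str) -> str:
        # Toy routing heuristic for illustration only.
        role = "research" if "find" in task.lower() else "code"
        handler = self._specialists.get(role)
        if handler is None:
            raise LookupError(f"No specialist registered for role: {role}")
        return handler(task)
```

Keeping routing in one place makes agent roles explicit and gives a single point to log, test, and change delegation policy.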

7. Prompt Management

Required Patterns:

  • System prompts externalized (not hardcoded)
  • Prompt versioning
  • Variables/templating for dynamic content
  • Prompt testing/validation
  • Clear prompt documentation

GOOD:

# prompts/system.yaml
agent_system_prompt:
  version: "1.2"
  template: |
    You are a helpful assistant with access to these tools:
    {% for tool in tools %}
    - {{ tool.name }}: {{ tool.description }}
    {% endfor %}

    Current date: {{ current_date }}
    User preferences: {{ user_prefs }}
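
Rendering such an externalized template can be sketched with Jinja2 (assuming it is installed); the `Tool` shape and variable values here are illustrative stand-ins, not part of the skill.

```python
from collections import namedtuple
from jinja2 import Template

Tool = namedtuple("Tool", ["name", "description"])

def render_system_prompt(tools: list, current_date: str) -> str:
    # In practice the template string would be loaded from the YAML file.
    template = Template(
        "You are a helpful assistant with access to these tools:\n"
        "{% for tool in tools %}- {{ tool.name }}: {{ tool.description }}\n"
        "{% endfor %}Current date: {{ current_date }}"
    )
    return template.render(tools=tools, current_date=current_date)
```

Because the template lives outside the code, prompt changes can be versioned and reviewed without touching agent logic.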

8. Observability

Required Patterns:

  • Structured logging (JSON format)
  • Request/response tracing
  • Token usage tracking
  • Latency metrics
  • Error rate monitoring

Logging Checklist:

| Event | Log Level | Required Fields |
|-------|-----------|-----------------|
| Agent start | INFO | task_id, user_id, task |
| Tool call | DEBUG | tool_name, inputs, duration |
| Model call | DEBUG | model, tokens_in, tokens_out, latency |
| Error | ERROR | error_code, message, stack_trace |
| Agent complete | INFO | task_id, status, total_duration, total_tokens |

GOOD:

logger.info("agent_started", extra={
    "task_id": task_id,
    "user_id": user_id,
    "task_type": task.type,
})

logger.debug("tool_executed", extra={
    "task_id": task_id,
    "tool": tool.name,
    "duration_ms": duration,
    "success": result.success,
})

Configuration Best Practices

Required Configuration

| Setting | Type | Description |
|---------|------|-------------|
| model | string | Model identifier |
| max_iterations | int | Loop limit |
| timeout_seconds | int | Overall timeout |
| tool_timeout | int | Per-tool timeout |
| max_tokens | int | Response limit |
| temperature | float | Model temperature |
| retry_attempts | int | Retry count |

Configuration Hierarchy

1. Environment variables (secrets, deployment-specific)
2. Config files (default.yaml, production.yaml)
3. Code defaults (fallbacks only)

GOOD:

class AgentConfig(BaseSettings):
    model: str = "claude-3-sonnet"
    max_iterations: int = 10
    timeout_seconds: int = 300

    class Config:
        env_prefix = "AGENT_"
        env_file = ".env"

Testing Patterns

Test Categories

| Type | Coverage | Purpose |
|------|----------|---------|
| Unit | Tools, utilities | Isolated component tests |
| Integration | Agent + tools | End-to-end flows |
| Snapshot | Prompts | Detect prompt regressions |
| Eval | Agent responses | Quality benchmarks |

Required Tests

  • Tool input validation tests
  • Tool error handling tests
  • Agent termination condition tests
  • State transition tests
  • Prompt template rendering tests
  • Configuration loading tests

GOOD:

def test_search_tool_timeout():
    tool = SearchTool(timeout=0.001)
    result = tool.execute("test query")
    assert not result.success
    assert "timeout" in result.error.lower()

def test_agent_max_iterations():
    agent = Agent(max_iterations=3)
    # Mock tool that never completes
    agent.tools = [InfiniteLoopTool()]
    result = agent.run("impossible task")
    assert result.iterations == 3
    assert result.status == "max_iterations_reached"

Review Output Format

## Agent Review: [project-name]

### Summary
[1-2 sentence overview]

### Architecture Score

| Category | Score | Notes |
|----------|-------|-------|
| Folder Structure | X/5 | |
| Tool Design | X/5 | |
| Agent Loop | X/5 | |
| Memory Management | X/5 | |
| Error Handling | X/5 | |
| State Management | X/5 | |
| Observability | X/5 | |
| Testing | X/5 | |
| **Overall** | **X/5** | |

### Critical Issues
- [ ] [Issue] - Location: [file]

### Recommendations
- [ ] [Recommendation] - Priority: [High/Medium/Low]

### Strengths
- [What the implementation does well]

Anti-Patterns to Flag

| Anti-Pattern | Problem | Fix |
|--------------|---------|-----|
| God Agent | Single agent does everything | Split by responsibility |
| Infinite Loop | No termination condition | Add max iterations |
| Silent Failures | Errors swallowed | Structured error handling |
| Hardcoded Prompts | Prompts in code | Externalize to files |
| No Observability | Can't debug production | Add structured logging |
| Mutable State | Race conditions, bugs | Immutable state updates |
| No Timeouts | Hanging requests | Configure all timeouts |
| Missing Validation | Invalid inputs accepted | Schema validation |
