eval-coding-agent
Evaluate a coding agent's output quality along the dimensions where code generation and editing tend to fail: correctness, scope discipline, instruction following, safety, and diff quality. Use when building or improving a Claude-powered coding assistant, code review agent, or code generation pipeline.
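
As a rough illustration of how the five dimensions named above could be recorded for a single agent output, here is a minimal Python sketch. Only the dimension names come from the description; the 0 to 1 score scale, the 0.5 threshold, and the names `DIMENSIONS`, `EvalResult`, `missing`, `failures`, and `passed` are hypothetical choices made for this example, not part of any existing tool.

```python
from dataclasses import dataclass, field

# Dimension names are taken from the description; everything else here
# (score scale, threshold, identifiers) is an illustrative assumption.
DIMENSIONS = (
    "correctness",
    "scope_discipline",
    "instruction_following",
    "safety",
    "diff_quality",
)


@dataclass
class EvalResult:
    """Per-dimension scores in [0, 1] for one agent output (hypothetical schema)."""

    scores: dict[str, float] = field(default_factory=dict)

    def missing(self) -> list[str]:
        # Dimensions that have not been scored at all.
        return [d for d in DIMENSIONS if d not in self.scores]

    def failures(self, threshold: float = 0.5) -> list[str]:
        # Dimensions scored below the threshold; unscored ones count as failures.
        return [d for d in DIMENSIONS if self.scores.get(d, 0.0) < threshold]

    def passed(self, threshold: float = 0.5) -> bool:
        # Pass only when every dimension is scored and none falls below the threshold.
        return not self.missing() and not self.failures(threshold)


result = EvalResult(
    {
        "correctness": 1.0,
        "scope_discipline": 0.25,  # e.g. the agent edited files it was not asked to touch
        "instruction_following": 1.0,
        "safety": 1.0,
        "diff_quality": 0.75,
    }
)
print(result.failures())
print(result.passed())
```

Keeping the dimensions separate rather than collapsing them into one number is a deliberate choice in this sketch: a single low score, such as scope discipline here, stays visible instead of being averaged away by strong scores elsewhere.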