Refactor

Review code changes in current branch against main, grade the quality, and iteratively refactor until the code meets a minimum quality threshold of 90/100.

Instructions

Get Changed Files

git diff main --name-only

Get the full diff for context:

git diff main

Run Vitest

Run Vitest in CI mode before grading:

pnpm vitest run --reporter=verbose

If tests fail, factor that into the grade (deduct from Test Coverage score). Don't run Playwright - ask the user to run those manually if needed.

Review and Grade

Review the code changes for:

Correctness: Does the code do what it's supposed to?
TypeScript: No any , proper types, strict mode compliance (see TypeScript Check below)
React patterns: Proper hooks usage, client/server boundaries
Code style: Follows project conventions (no semicolons, proper imports)
Error handling: Appropriate error boundaries and handling
Performance: No obvious N+1 queries, unnecessary re-renders
Security: No vulnerabilities (XSS, injection, etc.)
Accessibility: Proper aria attributes, semantic HTML
Test coverage: New logic/features have tests, edge cases covered (checks tests exist, doesn't run them). See Test Coverage Deep Check below.
DRY: No unnecessary duplication
Clarity: Code is readable and self-documenting

Also include all checks from /preflight (dead code, git hygiene, env vars, imports, dates, etc.). A score of 90+ means preflight would pass.

Assign a letter grade (A-F) and a score out of 100.

Branch Logic

If score >= 90

Output to terminal: "Good enough, let's ship it!"
List what could have made the score higher (but don't block on it)
Get the current branch name: git branch --show-current
Create tmp/ directory in project root if it doesn't exist
Write suggestions to tmp/{branch-name}.md with:
Final grade and score
Summary of what's good
List of potential improvements (nice-to-have)

If score < 90

Identify the issues bringing the score down
Refactor the code to fix those issues
After refactoring, re-review and grade again
Repeat until score >= 90
Then follow the >= 90 flow above

Grading Scale

Grade Score Range Meaning

A+ 97-100 Exceptional - production ready, exemplary code

A 93-96 Excellent - minor nitpicks only

A- 90-92 Very Good - meets quality bar

B+ 87-89 Good - small issues to address

B 83-86 Above Average - some improvements needed

B- 80-82 Satisfactory - multiple issues

C+ 77-79 Adequate - significant issues

C 73-76 Acceptable - needs work

C- 70-72 Below Average - substantial issues

D 60-69 Poor - major problems

F <60 Failing - fundamental issues

Output Format

Terminal Output (always)

Code Review: {branch-name}

Tests: Vitest ran in CI mode. Playwright skipped - run manually if needed.

Grade: {grade} ({score}/100)

Summary

{Brief description of what the changes do}

What's Good

{positive point 1}
{positive point 2}

Issues Found

{file}:{line} - {issue description}

Score Breakdown

Correctness: {x}/15
TypeScript: {x}/10
React Patterns: {x}/10
Code Style: {x}/10
Error Handling: {x}/10
Performance: {x}/10
Security: {x}/10
Test Coverage: {x}/10
Accessibility: {x}/5
DRY: {x}/5
Clarity: {x}/5

{If score >= 90: "Good enough, let's ship it!"} {If score < 90: "Refactoring to improve score..."}

Markdown File (tmp/{branch-name}.md)

Only written when score >= 90 (either initially or after refactoring):

Code Review: {branch-name}

Date: {date} Final Grade: {grade} ({score}/100)

Summary

{What the changes accomplish}

What's Good

{List of positive aspects}

Potential Improvements

{Things that could have pushed the score higher - nice to have, not blockers}

Refactoring History (if applicable)

{If refactoring was needed, list what was changed and score progression}

TypeScript Check

Some projects have legacy type errors in unrelated files. To check only changed files:

Check if 'types' target is available in make output

make 2>/dev/null | grep -q 'types' && make types || pnpm tsc --noEmit

Use make types if available (checks changed files only), otherwise fall back to pnpm tsc --noEmit .

Test Coverage Deep Check

Don't just check if new helper functions have tests - trace the code path:

Integration over unit: If you add a helper function getX() , find where it's called (e.g., calculateY() ) and verify calculateY() has tests covering the new logic
Follow the call chain: New code in utils/foo.ts → used by calculateBar() → verify calculateBar.test.ts covers the new behavior
Test the behavior, not just the function: A test for getMonthsInBillingInterval() returning 3 is good, but a test proving calculateLettingInvoiceCosts() actually multiplies by 3 for quarterly is better

Example gap to catch:

❌ "Added tests for getMonthsInBillingInterval() " (tests the helper)
✅ "Added tests for calculateLettingInvoiceCosts() with QUARTERLY billing" (tests the integration)

Edge Case Analysis

Actively question defensive code patterns:

Fallback values: When you see x ?? defaultValue or x ? fn(x) : fallback , ask:

What happens when the fallback is used?
Is the fallback correct for all scenarios?
Should this be an error instead of a silent default?

Optional chaining: When you see obj?.prop , trace what happens when obj is null:

Does downstream code handle the undefined correctly?
Are there tests for the null case?

Type narrowing: When interfaces allow null/undefined, verify the code handles both branches

Example to catch:

const months = billing?.interval ? getMonths(billing.interval) : 1 const cost = new Decimal(billing?.rate || 0).mul(months)

Ask: "If billing is null, we get 0 * 1 = 0 . Is that correct, or should this throw?"

Notes

Focus on meaningful improvements, not bikeshedding
Each refactoring iteration should target the highest-impact issues first
Don't over-engineer - fix what's wrong, don't add unnecessary complexity
If stuck in a loop (same issues keep appearing), break out and explain why
Maximum 5 refactoring iterations to prevent infinite loops

refactor

Safety Notice

Copy this and send it to your AI assistant to learn

Code Review: {branch-name}

Summary

What's Good

Issues Found

Score Breakdown

Code Review: {branch-name}

Summary

What's Good

Potential Improvements

Refactoring History (if applicable)

Check if 'types' target is available in make output

Source Transparency

Related Skills

infographic

visx

test-review

creative-design