Email Whiz
Gmail inbox management via MCP. Parallel-first, large-inbox optimized.
Canonical Vocabulary
| Term | Definition |
|---|---|
| 5-bucket | DO / DELEGATE / DEFER / REFERENCE / NOISE — five triage buckets (historically "5-bucket") |
| inbox zero | State where inbox contains only items requiring action today |
| fast-lane | Sender/subject-only classification without reading email content |
| security gate | Subject-keyword scan of NOISE before archiving (catches 2FA, password resets) |
| session cache | Local cache of Phase 0 results (1h TTL) at ~/.claude/email-whiz/session-cache.json |
| tier | Inbox size class: Small (<50) / Medium (50-500) / Large (500-5k) / Massive (5k+) |
| quick | Modifier prefix that skips Phase 0 for lightweight modes |
| combo | Chain two modes sharing discovered state: <mode> + <mode> |
| mega-wave | Single message with 10-15 independent tool calls bundled together |
| baseline | Rolling 30-day average inbox count; bankruptcy triggers are relative to this |
| auto-rule | Gmail filter created from behavioral pattern analysis |
| VIP | Sender who receives consistent replies and high-priority treatment |
| engagement | Reply rate to a sender: replies sent / emails received |
| confidence | HIGH >80% / MEDIUM 50-80% / LOW <50% — governs auto-rule actions |
Dispatch
$ARGUMENTS | Action |
|---|---|
triage | Categorize inbox with fast-lane + 5-bucket, batch process by category |
inbox-zero | Daily/weekly routine, progress tracking, baseline bankruptcy detection |
filters | Pattern-detected filter suggestions with confidence scoring |
auto-rules | Behavioral analysis → auto-create Gmail filters from templates |
analytics | Volume trends, response times, sender frequency, timing patterns |
newsletters | Subscription audit + unsubscribe plan |
labels | Taxonomy analysis, merge/rename/delete recommendations |
search <goal> | Natural language → Gmail search query |
senders | VIP + noise identification |
digest | Recent important email summary |
cleanup | Archive candidates, duplicate detection, batch archiving |
audit | Full inbox health report with score and action plan |
quick search or quick digest | Skip Phase 0, run mode directly. Only search and digest are quick-compatible |
<mode> + <mode> | Chain modes, sharing discovered state (triage + filters) |
| (empty) | Show this mode menu |
Classification gate: If $ARGUMENTS could match multiple modes, ask which the user wants before proceeding.
Hybrid Mode Protocol
Read operations — execute immediately:
gmail_search_emails, gmail_read_email, gmail_list_email_labels, gmail_list_filters, gmail_get_filter
Write operations — confirm before executing:
Single: gmail_modify_email, gmail_delete_email, gmail_create_label
Batch: gmail_batch_modify_emails, gmail_batch_delete_emails (MUST use Destructive Warning template — user types "DELETE" to confirm)
Labels: gmail_update_label, gmail_delete_label, gmail_get_or_create_label
Filters: gmail_create_filter, gmail_create_filter_from_template, gmail_delete_filter
Confirmation format: See templates.md. Always show scope, count, and reversibility before write ops.
Parallelization Rules
- Independent tool calls go in ONE message. Sequential independent calls is a bug.
- Phase 0 fuses with mode's first queries — never 2 sequential rounds when fusion is possible.
- Batch ops on disjoint ID sets: parallel. NOISE + REFERENCE + DEFER = 1 message, 3 calls.
- Analytics/Audit: mega-wave — 10-15 search queries in 1 message.
- gmail_read_email: batch 5-8 per message (full messages are large).
- Filter creation: parallel after conflict check passes.
- Combo mode boundary: sequential (data dependency). But skip Phase 0 for mode 2.
- Max 15-20 calls per message. Beyond that, a single 429 impacts the entire bundle.
- On 429: preserve completed results, drop to sequential with 2s gaps. 3x429 in 60s → pause 60s.
Details: efficiency-guide.md § Parallel Call Map.
Phase 0: Discovery (skip with quick prefix)
0a: Session Cache Check
!uv run python scripts/inbox_snapshot.py cache load
If valid (< 1h old) → use cached labels, filters, tier. Jump to mode.
0b: Parallel Discovery + Mode Fusion (cache miss)
Issue Phase 0 calls AND mode's first queries in a SINGLE message:
Core discovery (always):
gmail_list_email_labels→ label name/ID map + INBOXmessagesTotal/messagesUnreadgmail_list_filters→ existing filter criteriagmail_search_emails→"in:inbox is:unread"(maxResults: 10) → sample
Mode-specific additions (same message):
- Triage: +
gmail_search_emails "is:unread"(maxResults: per tier) - Inbox Zero: +
"is:unread newer_than:1d"+"label:_deferred" - Analytics: + 4x weekly volume queries (mega-wave)
- Audit: + newsletter count + sent count + overdue count (mega-wave)
- Cleanup: + 4x stale-detection queries
0c: Tier Assessment
Use INBOX label messagesTotal (NOT resultSizeEstimate — it is unreliable).
| Inbox Size | Tier | maxResults | Strategy |
|---|---|---|---|
| <50 | Small | 50 | Process all |
| 50-500 | Medium | 100 | Sample recent + unread |
| 500-5000 | Large | 200 | Importance-first, date-range splitting |
| 5000+ | Massive | 200 | Date-range parallel queries, baseline bankruptcy |
Why maxResults 100-200: MCP server has N+1 fetch pattern (each result triggers a separate API call). maxResults=500 ≈ 501 API calls ≈ 2505 quota units. 200 ≈ 1005 units.
Massive tier: Split by date-range (30d chunks), issue ALL chunks in parallel. Always query is:important is:unread alongside general scan. No pageToken on MCP server — date-range splitting is the ONLY pagination.
0d: Write Session Cache
!uv run python scripts/inbox_snapshot.py cache save --labels '...' --filters '...' --inbox-count N --unread-count M --tier TIER
Invalidate cache after any write operation (filter/label create, batch modify).
Use discovered label names throughout — never invent names that conflict with existing taxonomy.
Mode: Triage
Three-pass classification. Details: triage-framework.md § Fast-Lane, efficiency-guide.md.
Pass 1: Fast-Lane (zero content reads)
Classify by sender + subject + snippet ONLY from search results:
noreply@*,notifications@*,calendar-notification@*→ NOISE- Known VIP senders → DO
- Subject
[FYI]/[INFO]/newsletter/receipt/confirmation → REFERENCE - Subject
[ACTION]/[URGENT]/deadline → DO - Unmatched → UNDECIDED
Pass 1.5: Security Gate (mandatory before archiving NOISE)
Scan all NOISE subjects/snippets for security keywords: password reset, verify your, sign-in, unusual activity, 2FA, account locked, verification code, one-time password, security alert, new device, suspicious login, recovery phone, two-step verification, confirm your identity, new sign-in method. Any match → pull from NOISE to REVIEW. Also: any noreply@ email newer_than:7d → REVIEW (login notifications have no expiry; codes expire but should still surface).
Pass 2: Batch Operations (all in ONE message)
Precondition: Call gmail_get_or_create_label for _reference and _deferred if not already resolved in Phase 0.
Issue 3 gmail_batch_modify_emails calls simultaneously (disjoint ID sets = parallel safe):
- NOISE ids → remove INBOX
- REFERENCE ids → add
_reference, remove INBOX - DEFER ids → add
_deferred, remove INBOX
Pass 3: Content Inspection (UNDECIDED + REVIEW only)
Batch gmail_read_email 5-8 per message. Apply full 5-bucket. Batch-process results with same parallel pattern as Pass 2. Report with fast-lane triage summary (show auto-classified %). Surface recurring NOISE sender patterns as filter candidates.
Mode: Inbox Zero
Structured habit system. Details: inbox-zero-system.md.
Daily (5 min): Fuse Phase 0 + is:unread newer_than:1d + label:_deferred in 1 message. Quick triage pass (5-bucket). Batch archive NOISE + REFERENCE. Report: inbox count before → after. Update progress file.
Weekly (15 min): Issue 3 queries in 1 message: deferred sweep + filter effectiveness + newsletter sweep. Clear, re-defer, or archive. Update weekly_reviews.
Bankruptcy detection: Relative to 30-day baseline (not absolute). YELLOW: 20-50% above baseline. ORANGE: 50-100% above baseline (aggressive triage + filter creation). RED: >100% above baseline (full bankruptcy protocol). Acceleration trigger: week-over-week growth >20% for 2+ weeks. New users default to 500 until 7+ snapshots establish a baseline. See inbox-zero-system.md for recovery protocols.
Mode: Filters
Pattern-detected filter suggestions. Details: filter-patterns.md.
Issue sender clustering + subject pattern + gmail_list_filters ALL in 1 message. Score candidates: HIGH (volume ≥20, engagement <5%) → offer create. MEDIUM (≥10, <15%) → suggest. LOW → show only. Check conflicts before creating. After approval: issue multiple gmail_create_filter calls in 1 message. Massive tier: split 90d into 3x30d chunks, query in parallel.
Mode: Auto-Rules
Deep behavioral analysis → automated filter creation. Details: filter-patterns.md § Auto-Rule Detection.
Issue sender analysis + subject mining + filter fetch in 1 message. Massive tier: split 90d into 3x30d date-range chunks, issue ALL in parallel. Score by confidence. Present auto-rules report. HIGH: gmail_create_filter_from_template with confirmation. MEDIUM: individual review. Creation: multiple filters in 1 message. Schedule learning loop after 2 weeks.
Mode: Analytics
Email communication metrics via mega-wave. Details: analytics-guide.md.
Issue ALL queries in 1 message: 4x weekly volume + sent count + inbox reach + reply rate + overdue + sender frequency + label distribution (~12 calls, ~65% message reduction). Classify volume trend (growing/stable/declining). Compute inbox reach rate (good: <30%). Identify top 20 senders by volume + engagement matrix. Present full analytics report.
Mode: Newsletters
Subscription audit. Issue list:* + subject:newsletter in 1 message. Estimate frequency + read rate per subscription. Classify: UNSUBSCRIBE / KEEP / FILTER. Surface unsubscribe links. On approval: create filter or provide unsubscribe URL. Details: workflows.md § Unsubscribe Strategies.
Mode: Labels
Label taxonomy analysis. gmail_list_email_labels returns messagesTotal/messagesUnread per label. For 30d activity classification, issue per-label search queries in batches of 10-15. Find: duplicates/synonyms, stale labels (no messages 90d+), flat structure needing hierarchy. Suggest: MERGE / RENAME / DELETE / RESTRUCTURE. For merges: gmail_batch_modify_emails to migrate, then gmail_delete_label. Details: workflows.md § Label Hierarchy.
Mode: Search
Quick-compatible (no Phase 0). Natural language → Gmail search query. Parse intent (who/what/when/status), map to operators, add negations, show query + explanation + expected matches. Offer to run. Details: filter-patterns.md § Gmail Search Operators.
Mode: Senders
VIP + noise identification. Issue sample search + per-sender reply-rate queries in 1 message (if top-sender count > 12, split reply-rate queries across 2 messages to stay under call cap). VIP: reply rate >50% or consistently starred. Noise: volume >10, engagement 0%. Present sender report. On approval: create VIP filter (mark important) or noise filter. Details: triage-framework.md § Sender Shortcuts, workflows.md § VIP Management.
Mode: Digest
Quick-compatible (no Phase 0). Search is:important newer_than:3d (or is:starred OR is:unread is:important). Extract sender, subject, 1-line summary per email. Categorize: RESPOND / FYI / READING. Present digest report. Offer: "open N" | "archive fyi". Details: templates.md § Digest.
Mode: Cleanup
Archive candidates. Issue ALL 4 stale-detection queries in 1 message: receipts/confirmations older_than:30d + old newsletters older_than:30d + expired deferrals older_than:14d + fully-read no-label older_than:60d. Batch-archive ALL results in parallel gmail_batch_modify_emails calls. Never use gmail_batch_delete_emails — archive only. Details: workflows.md § Batch Operations.
Mode: Audit
Full inbox health via mega-wave. Issue 11 base queries in 1 message (per-label and per-sender queries run as a second wave): inbox count, unread %, filter count, inbox reach, reply rate, overdue, newsletter volume, deferred pile, sent volume, label health, sender top-20 (~70% message reduction). Score 0-100 based on inbox reach rate, filter coverage, label health. Identify top 3 quick wins. Generate action plan. Details: analytics-guide.md, templates.md § Full Audit.
Scope Boundaries
IS for: Gmail triage, inbox zero, filters, auto-rules, analytics, newsletters, labels, search, senders, digest, cleanup, audit. NOT for: Composing outbound emails, calendar, Google Drive, non-Gmail, real-time monitoring, CRM.
Reference File Index
| File | Content | Load When |
|---|---|---|
references/efficiency-guide.md | Cache, parallel calls, tiers, MCP constraints, fast-lane, pagination | All modes |
references/triage-framework.md | 5-bucket decision trees, fast-lane rules, security gate, batch processing | triage, inbox-zero |
references/filter-patterns.md | Gmail operators, filter templates, auto-rule algorithms, learning loop | filters, auto-rules, search |
references/workflows.md | Bankruptcy, VIP, batch ops, label hierarchy, unsubscribe, combo mode | cleanup, labels, senders, newsletters |
references/templates.md | All confirmation, report, and execution result templates | Every mode |
references/inbox-zero-system.md | Daily/weekly routines, progress schema, baseline bankruptcy | inbox-zero |
references/analytics-guide.md | Volume, response time, sender frequency, timing methodology | analytics, audit |
references/tool-reference.md | All 18 Gmail MCP tool signatures, parameters, MCP constraints | When selecting tools |
Scripts:
| Script | Purpose | Run When |
|---|---|---|
scripts/inbox_snapshot.py save --inbox-count N | Save inbox count snapshot | After each inbox-zero check |
scripts/inbox_snapshot.py trend --days 7 | Compute inbox trend | During analytics mode |
scripts/inbox_snapshot.py baseline --days 30 | Compute rolling baseline for bankruptcy | During inbox-zero |
scripts/inbox_snapshot.py cache load | Load session cache (1h TTL) | Phase 0 step 0a |
scripts/inbox_snapshot.py cache save --labels --filters --inbox-count --unread-count --tier | Write session cache | Phase 0 step 0d |
scripts/inbox_snapshot.py cache clear | Invalidate session cache | After write operations |
Critical Rules
- Run Phase 0 discovery before any workflow — never assume label/filter state (unless
quickmode) - Confirm all write operations — show scope, count, and reversibility before calling
- Batch similar operations — never loop
gmail_modify_emailwhengmail_batch_modify_emailsapplies - Preserve existing taxonomy — suggest improvements, never silently rename
- Check filter conflicts before creating — call
gmail_list_filtersfirst - Include confidence scores on all filter/rule suggestions
- Provide rollback guidance for every bulk operation
- Handle MCP failures gracefully — see
efficiency-guide.md§ Error Recovery - Surface unsubscribe links — make newsletter cleanup actionable
- Prioritize quick wins — high impact, low effort first in all reports
- Update inbox-zero progress after every routine
- Never use
gmail_batch_delete_emailsunless the user explicitly requests deletion — archive by default in all modes. Always use the Destructive Warning template (TYPE "DELETE" confirmation) - Issue ALL independent tool calls in a single message — sequential independent calls is a bug
- Use fast-lane classification before reading content — sender + subject resolve 60%+ of triage
- ALWAYS run the security gate on NOISE before archiving — scan subjects for 2FA/password/security keywords
- Check session cache before Phase 0. Invalidate cache after any write operation (
gmail_create_label,gmail_get_or_create_label,gmail_create_filter,gmail_create_filter_from_template,gmail_batch_modify_emails,gmail_batch_delete_emails). - maxResults 100-200 (not 500) — MCP N+1 pattern means 500 results ≈ 501 API calls
- Bankruptcy triggers are relative to 30-day baseline, not absolute thresholds