TokenKiller (Universal Throttling)

Goal

Systematically reduce token consumption without noticeably lowering success rate, applicable to agents with multiple capabilities (search/coding/debugging/testing/docs).

Task Complexity Assessment

Before setting budgets, assess task complexity:

Complexity	Criteria	Tool Budget	Output Budget
Simple	Single file modification, single-point localization, clear requirements	≤3 calls	≤50 lines
Medium	Across 2-3 files, needs simple exploration, relatively clear requirements	≤6 calls	≤120 lines
Complex	Cross-module refactoring, multi-step debugging, unclear requirements	≤10 calls	≤200 lines
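
The table above can be held as a small lookup. The sketch below is illustrative only: the budget values are copied from the table, while `Budget`, `BUDGETS`, and the `assess_complexity` heuristic (file count plus a clear/unclear flag) are hypothetical names and a deliberate simplification of the criteria column.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Budget:
    tool_calls: int    # maximum number of tool calls
    output_lines: int  # maximum number of output lines


# Values copied from the complexity table.
BUDGETS = {
    "simple": Budget(tool_calls=3, output_lines=50),
    "medium": Budget(tool_calls=6, output_lines=120),
    "complex": Budget(tool_calls=10, output_lines=200),
}


def assess_complexity(files_touched: int, requirements_clear: bool) -> str:
    """Hypothetical heuristic: unclear requirements always count as complex."""
    if not requirements_clear:
        return "complex"
    if files_touched <= 1:
        return "simple"
    if files_touched <= 3:
        return "medium"
    return "complex"


budget = BUDGETS[assess_complexity(files_touched=2, requirements_clear=True)]
print(budget)
```

A real assessment would likely weigh more signals (module boundaries, debugging depth); the point is only that the budget is fixed once, before any tool call is made.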

Extension Mechanism (Soft Warning): When budget is about to run out but task is incomplete:

Output warning: [TokenKiller] Budget running low, current progress X/Y, remaining work: ...
Continue execution, but switch to more conservative strategy
User can interrupt or request more detailed output at any time
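
As a rough sketch of the soft warning, the function below emits the message in the format given above once usage crosses a threshold. The 80% threshold and the function name `soft_warning` are assumptions; the source only says "about to run out".

```python
from typing import Optional


def soft_warning(used: int, budget: int, done: int, total: int,
                 remaining: str, threshold: float = 0.8) -> Optional[str]:
    """Return the warning text when the budget is nearly spent and work remains.

    `threshold` is an assumed cut-off; the skill text does not fix a number.
    """
    if done >= total:               # task already complete, nothing to warn about
        return None
    if used < threshold * budget:   # still comfortably within budget
        return None
    return (f"[TokenKiller] Budget running low, current progress "
            f"{done}/{total}, remaining work: {remaining}")


print(soft_warning(used=5, budget=6, done=2, total=4, remaining="update tests"))
```

The warning does not stop execution; it only signals the switch to a more conservative strategy.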

Default Working Mode (Balanced)

Global Hard Rules (Must Follow)

Goal First, Evidence Later: State the goal in one sentence (L0) first, then decide if evidence is needed (L2/L3).
Three-Question Limit: When clarification is needed, ask at most 3 questions at a time; otherwise proceed with "default assumptions" and mark replaceable points.
Progressive Disclosure: By default, only fetch "minimum necessary information"; never dump large files/full logs directly into context.
Diff-First: Prioritize outputting patches/changes/command and result summaries; avoid reposting entire files.
Deduped References: Information already seen should only be briefly referenced, not pasted again.

Budget Gate (Budget + Gate)

At the start of each task, assess complexity and set corresponding budget (see above "Task Complexity Assessment"), then execute gates:

Tool Call Budget: Set by complexity (Simple ≤3, Medium ≤6, Complex ≤10).
Read Budget: Read a single file in full by default; for large files (>200 lines), read only the hit segments or read in sections.
Output Budget: Set by complexity (Simple ≤50 lines, Medium ≤120 lines, Complex ≤200 lines).

If any gate is exceeded:

First narrow scope (path/file/module) → Then switch search strategy → Finally expand reading and output.
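
One possible way to track the gates and the escalation order is sketched below. `GateState`, `ESCALATION`, and `next_response` are hypothetical names, and counting "how many times a gate has tripped" is an assumed way to walk the three steps.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GateState:
    tool_budget: int
    output_budget: int
    tool_calls: int = 0
    output_lines: int = 0

    def exceeded(self) -> List[str]:
        """Names of the gates whose budget has been exceeded."""
        gates = []
        if self.tool_calls > self.tool_budget:
            gates.append("tool_calls")
        if self.output_lines > self.output_budget:
            gates.append("output")
        return gates


# Order taken from the rule above: cheapest response first.
ESCALATION = ["narrow_scope", "switch_search_strategy", "expand_reading_and_output"]


def next_response(times_tripped: int) -> str:
    """Pick the escalation step for the Nth gate trip (0-based), capped at the last step."""
    return ESCALATION[min(times_tripped, len(ESCALATION) - 1)]


state = GateState(tool_budget=3, output_budget=50, tool_calls=4, output_lines=10)
print(state.exceeded(), next_response(0))
```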

Token Consumption Self-Check

High-Consumption Behaviors (Avoid)

Reading >500 line files in full
Outputting complete file contents (should output diff)
Repeatedly pasting the same code/log
Listing entire directory trees
Outputting lengthy explanatory text

Self-Check Timing

After every 3 tool calls, quickly self-check:

Am I currently at L0-L2 level?
Is there duplicate information?
Is output exceeding necessary length?

Information Layers (L0-L3)

L0: One-sentence goal (required)
L1: At most 3 hard constraints (required)
L2: Evidence summary (file path + line number / key command output lines / key config items)
L3: Full long content (only pull in specific scenarios, see below "L3 Pull Scenarios")

Default output and context stay at L0-L2.

L3 Pull Scenarios (Explicit)

Only pull L3 (full content) in these scenarios:

Code Modification: When exact indentation/format matching is needed, read the target function's complete code
Config Debugging: When config items are interdependent, read the complete config block
Error Analysis: When the error message is incomplete, read the complete stack trace or surrounding context
User Explicit Request: When the user asks to see full content

Decision Flow: L2 Evidence → Attempt to proceed → Fail → Determine if L3 is needed → Pull minimum necessary range
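
The decision flow can be read as a small predicate. The sketch below is one interpretation, not a definitive rule: it assumes an explicit user request bypasses the "attempt with L2 first" step, and the names `L3Scenario` and `should_pull_l3` are hypothetical.

```python
from enum import Enum
from typing import Optional


class L3Scenario(Enum):
    CODE_MODIFICATION = "code_modification"
    CONFIG_DEBUGGING = "config_debugging"
    ERROR_ANALYSIS = "error_analysis"
    USER_REQUEST = "user_request"


def should_pull_l3(l2_attempt_succeeded: bool,
                   scenario: Optional[L3Scenario]) -> bool:
    """Decide whether full content (L3) is justified.

    Assumption: a user request is honored immediately; every other
    scenario requires that an attempt with L2 evidence failed first.
    """
    if scenario is L3Scenario.USER_REQUEST:
        return True
    if l2_attempt_succeeded:
        return False
    return scenario is not None


print(should_pull_l3(False, L3Scenario.ERROR_ANALYSIS))
```

Even when the predicate returns true, only the minimum necessary range should be pulled, for example one function or one config block.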

Multi-Skill Collaboration

When this Skill is activated alongside other Skills:

Priority Rules

Functional Skills First: Specific rules of functional skills like pdf, xlsx take precedence
TokenKiller as Constraint Layer: During other skill execution, continuously apply budget and layer rules
User Priority on Conflict: User's explicitly requested output format/content takes precedence over throttling rules

Collaboration Mode

[User Request] → [Functional Skill Processing] → [TokenKiller Constrains Output]

Workflow (General)

1) Task Entry (Any Domain)

Produce L0 + L1 (infer them quickly if the user didn't provide them)
Choose strategy (search/direct modification/verify first)
Execute minimal action
Immediately verify (cheapest verification first)
Summarize: only key conclusion + 1 next step

2) Search/Exploration (Priority Domain)

Priority:

Filename/Path (Glob)
Exact String (Grep)
Semantic Search (SemanticSearch)
Read File (Read, by sections/line ranges)

Rules:

Read only near hit points (±20 lines) or the sections belonging to the target function/component
Don't read through entire repository without localization
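
The ±20-line rule amounts to a clamped window around each hit. A minimal sketch follows, assuming the file is already available as a list of lines; `find_hits` and `read_window` are hypothetical helpers, not tools named by the skill.

```python
from typing import List, Tuple


def find_hits(lines: List[str], needle: str) -> List[int]:
    """Return 1-based line numbers that contain the exact string."""
    return [i for i, line in enumerate(lines, start=1) if needle in line]


def read_window(lines: List[str], hit_line: int,
                radius: int = 20) -> Tuple[int, int, List[str]]:
    """Return (start, end, text) for the 1-based inclusive range around a hit."""
    start = max(1, hit_line - radius)
    end = min(len(lines), hit_line + radius)
    return start, end, lines[start - 1:end]


source = [f"line {i}" for i in range(1, 101)]
hit = find_hits(source, "line 50")[0]
start, end, chunk = read_window(source, hit)
print(start, end, len(chunk))
```

For a 100-line file with a hit on line 50, this reads lines 30 to 70 instead of the whole file.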

3) Coding/Refactoring

Rules:

Minimal change surface first: if 1 file can be changed, don't change 5
Avoid "rewrite everything"; prioritize reusing existing structure
After modification, immediately run cheapest verification (tsc/build/lint)
Only show key diffs (at most 1-3 code references)

4) Debugging/Troubleshooting

Rules:

First list the 3 highest-probability hypotheses (sorted by information gain)
Verify only 1 hypothesis at a time, collecting only the evidence it requires
From logs, keep only: error line, stack top, related config, reproduction command (summarize the rest)
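
The log rule can be approximated with a small filter. This sketch assumes a log format where stack frames are indented under the error line and where error lines contain "error", "exception", or "fatal"; both are assumptions, and `condense_log` is a hypothetical helper. It covers only the error line and stack top, not config or reproduction commands.

```python
import re

# Assumed markers for an error line; real logs may need a different pattern.
ERROR_RE = re.compile(r"error|exception|fatal", re.IGNORECASE)


def condense_log(log: str, stack_top: int = 3) -> str:
    """Keep error lines plus the first few indented frames after each one."""
    lines = log.splitlines()
    kept = []
    frames_left = 0
    for line in lines:
        if ERROR_RE.search(line):
            kept.append(line)
            frames_left = stack_top          # start collecting stack frames
        elif frames_left > 0 and line.startswith((" ", "\t")):
            kept.append(line)
            frames_left -= 1
        else:
            frames_left = 0                  # left the stack trace
    omitted = len(lines) - len(kept)
    kept.append(f"... ({omitted} other lines omitted)")
    return "\n".join(kept)


sample = "\n".join([
    "INFO starting",
    "INFO loading config",
    "ERROR request failed: ValueError: bad input",
    "  at handler (app.py:10)",
    "  at router (app.py:42)",
    "  at server (srv.py:7)",
    "  at main (main.py:3)",
    "INFO shutting down",
])
print(condense_log(sample))
```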

5) Testing/Verification

Priority (from cheap to expensive):

lint / typecheck
build
unit test
e2e / browser automation

When a check fails, append only the "diff information" (what changed since the previous run); don't repost the full output.
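
The cheap-to-expensive order can be sketched as a list of checks that stops at the first failure. The check functions here are stand-in lambdas, not real lint or build invocations, and `failure_delta` reflects one assumed reading of "diff information": report only failures that were not already reported.

```python
from typing import Callable, List, Optional, Tuple

Check = Tuple[str, Callable[[], bool]]


def run_cheapest_first(checks: List[Check]) -> Optional[str]:
    """Run checks in the given order; return the first failing name, else None."""
    for name, run in checks:
        if not run():
            return name   # stop here so more expensive checks never run
    return None


def failure_delta(previous: List[str], current: List[str]) -> List[str]:
    """Return only the failure lines that did not appear in the previous run."""
    seen = set(previous)
    return [line for line in current if line not in seen]


executed = []
checks = [
    ("lint", lambda: executed.append("lint") or True),
    ("build", lambda: executed.append("build") or False),   # simulated failure
    ("unit", lambda: executed.append("unit") or True),
    ("e2e", lambda: executed.append("e2e") or True),
]
first_failure = run_cheapest_first(checks)
print(first_failure, executed)
```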

6) Docs/Summary

Rules:

Default to "short summary + next steps"
Don't restate the user's original words; refer to them as structured points
When docs are needed, use progressive disclosure: outline/points first, then expand details

Output Template (Default)

Use the following structure, unless user explicitly requests other format:

Conclusion: One sentence
Evidence: 2-5 items (path/line number/key command output)
Changes/Actions: What was done (at most 5 items)
Next Step: 1 item (most valuable next step)
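
A minimal sketch of a renderer for this template, enforcing the item limits stated above. The function name `render_summary` and the choice to raise `ValueError` on a violated limit are assumptions.

```python
from typing import List


def render_summary(conclusion: str, evidence: List[str],
                   actions: List[str], next_step: str) -> str:
    """Format the default output template, rejecting over-long sections."""
    if not 2 <= len(evidence) <= 5:
        raise ValueError("Evidence must have 2-5 items")
    if len(actions) > 5:
        raise ValueError("Changes/Actions allows at most 5 items")
    lines = [f"Conclusion: {conclusion}", "Evidence:"]
    lines += [f"- {item}" for item in evidence]
    lines.append("Changes/Actions:")
    lines += [f"- {item}" for item in actions]
    lines.append(f"Next Step: {next_step}")
    return "\n".join(lines)


report = render_summary(
    "Fixed the null check",
    ["src/a.py:10", "typecheck passes"],
    ["Patched a.py"],
    "Run unit tests",
)
print(report)
```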

Trigger Words (Recommended Auto-Enable)

Force enable this Skill when user mentions any of the following keywords/scenarios:

"waste token / save token / cost / context too long / log too long / repo too large / multi-step / agent"

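Trigger detection could be as simple as a case-insensitive substring match over the keywords listed above. This is a sketch under that assumption; `TRIGGERS` and `should_enable` are hypothetical names, and plain substring matching will produce false positives (for example "cost" inside "costume").

```python
# Keywords copied from the trigger list above.
TRIGGERS = (
    "waste token", "save token", "cost", "context too long",
    "log too long", "repo too large", "multi-step", "agent",
)


def should_enable(message: str) -> bool:
    """Return True when the message mentions any trigger keyword."""
    text = message.lower()
    return any(trigger in text for trigger in TRIGGERS)


print(should_enable("The log too long to paste here"))
```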