Token Guard

# TokenGuard — LLM API 429 Prevention Engine

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "Token Guard" with this command: npx skills add edmonddantesj/token-guard

TokenGuard — LLM API 429 Prevention Engine

<!-- 🌌 Aoineco-Verified | S-DNA: AOI-2026-0213-SDNA-TG01 -->

Version: 1.5.0
Author: Aoineco & Co.
License: MIT
Tags: rate-limit, 429, token-management, cost-optimization, llm-guard, high-performance

Description

Prevents LLM API 429 (Rate Limit / Resource Exhausted) errors by intercepting requests before they're sent. Designed for users on free/low-cost API plans who need maximum intelligence per dollar.

Core philosophy: "Intelligence is measured not by how much you spend, but by how little you need."

Problem

When using LLM APIs (especially Google Gemini Flash with 1M TPM limit):

  • Large documents (docx, PDFs) can consume the entire minute quota in one request
  • Failed requests still count toward token usage
  • Retry loops after 429 errors waste more tokens → death spiral
  • No built-in way to detect runaway/duplicate requests

Features

FeatureDescription
Pre-flight Token EstimationEstimates token count before API call (CJK-aware, no tiktoken dependency)
Real-time Quota TrackingTracks per-model per-minute token usage with sliding window
Smart ThrottleAuto-waits when quota > 80%, blocks at > 95%
Duplicate DetectionBlocks identical requests within 60s window (3+ = runaway)
Response CachingCaches successful responses for duplicate requests
Auto Model FallbackSwitches to cheaper/available model when primary is exhausted
429 Error ParserExtracts exact retry delay from Google/Anthropic error responses
Batch vs Mistake DetectionDistinguishes intentional bulk processing from error loops

Supported Models

Pre-configured quotas for:

  • gemini-3-flash (1M TPM)
  • gemini-3-pro (2M TPM)
  • claude-haiku (50K TPM)
  • claude-sonnet (200K TPM)
  • claude-opus (200K TPM)
  • gpt-4o (800K TPM)
  • deepseek (1M TPM)

Custom quotas can be added for any model.

Usage

from token_guard import TokenGuard

guard = TokenGuard()

# Before every API call:
decision = guard.check(prompt_text, model="gemini-3-flash")

if decision.action == "proceed":
    response = call_your_api(prompt_text)
    guard.record_usage(decision.estimated_tokens, model="gemini-3-flash")
    guard.cache_response(prompt_text, response)

elif decision.action == "wait":
    time.sleep(decision.wait_seconds)
    # retry

elif decision.action == "fallback":
    response = call_your_api(prompt_text, model=decision.fallback_model)

elif decision.action == "block":
    print(f"Blocked: {decision.reason}")

# If you get a 429 error:
guard.record_429("gemini-3-flash", retry_delay=53.0)

Integration with OpenClaw

Add to your agent's config or use as a middleware:

skills:
  - token-guard

The agent can invoke TokenGuard before any LLM API call to prevent quota exhaustion.

File Structure

token-guard/
├── SKILL.md          # This file
└── scripts/
    └── token_guard.py  # Main engine (zero external dependencies)

Status Output Example

{
  "models": {
    "gemini-3-flash": {
      "tpm_limit": 1000000,
      "used_this_minute": 750000,
      "remaining": 250000,
      "usage_pct": "75.0%",
      "status": "🟢 OK"
    }
  },
  "stats": {
    "total_checks": 42,
    "tokens_saved": 128000,
    "blocks": 3,
    "fallbacks": 2
  }
}

Zero Dependencies

Pure Python 3.10+. No pip install needed. No tiktoken, no external API calls. Designed for the $7 Bootstrap Protocol — every byte counts.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

Video Using Ai

create raw footage into AI-edited videos with this skill. Works with MP4, MOV, AVI, WebM files up to 500MB. content creators use it for turning raw footage i...

Registry SourceRecently Updated
General

Editor With Ai

Skip the learning curve of professional editing software. Describe what you want — cut out the pauses, add transitions, and export a clean final cut — and ge...

Registry SourceRecently Updated
General

To Voice Generator

Get voiced video files ready to post, without touching a single slider. Upload your text or script (TXT, DOCX, PDF, MP4, up to 200MB), say something like "co...

Registry SourceRecently Updated
General

Tencent Cloud Rum

Query Tencent Cloud RUM data, analyze Web performance (LCP/FCP/WebVitals), troubleshoot JS/Promise errors, analyze API latency & error rates, diagnose slow s...

Registry SourceRecently Updated