llm-supervisor

Graceful rate limit handling with Ollama fallback. Notifies on rate limits, offers local model switch with confirmation for code tasks.


Install

npx skills add dhardie/llm-supervisor

LLM Supervisor 🔮

Handles rate limits and model fallbacks gracefully.

Behavior

On Rate Limit / Overload Errors

When I encounter rate limits or overload errors from cloud providers (Anthropic, OpenAI):

  1. Tell the user immediately — Don't silently fail or retry endlessly
  2. Offer local fallback — Ask if they want to switch to Ollama
  3. Wait for confirmation — Never auto-switch for code generation tasks

Confirmation Required

Before using local models for code generation, ask:

"Cloud is rate-limited. Switch to local Ollama (qwen2.5:7b)? Reply 'yes' to confirm."

For simple queries (chat, summaries), I can switch without confirmation if the user has previously approved local fallback.
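The confirmation gate above can be sketched as a small shell helper. The function name and prompt wiring are illustrative assumptions, not part of the skill:

```shell
# Hypothetical helper: prompt once and succeed only on an explicit "yes".
confirm_local_switch() {
  model="$1"
  # Prompt goes to stderr so the answer can be read cleanly from stdin.
  printf 'Cloud is rate-limited. Switch to local Ollama (%s)? Reply yes to confirm: ' "$model" >&2
  read -r answer
  [ "$answer" = "yes" ]
}

# Usage sketch: switch only when the user explicitly confirms.
# if confirm_local_switch "qwen2.5:7b"; then switch_to_local; fi
```

Requiring the literal word "yes" keeps the default safe: any other reply, including an empty one, leaves the session on the cloud provider.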

Commands

/llm status

Report current state:

  • Which provider is active (cloud/local)
  • Ollama availability and models
  • Recent rate limit events
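One way to check the "Ollama availability" item is to probe the local server's HTTP API; Ollama listens on port 11434 by default, and GET /api/tags lists installed models. A minimal sketch:

```shell
# Probe the default local Ollama endpoint; degrade gracefully if it is down.
ollama_reachable() {
  curl -sf http://localhost:11434/api/tags >/dev/null 2>&1
}

if ollama_reachable; then
  echo "ollama: available"
else
  echo "ollama: unreachable"
fi
```

When the server is up, `ollama list` (shown below under Using Ollama) fills in the model names for the status report.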

/llm switch local

Manually switch to Ollama for the session.

/llm switch cloud

Switch back to cloud provider.

Using Ollama

# Check available models
ollama list

# Run a query
ollama run qwen2.5:7b "your prompt here"

# For longer prompts, use stdin
echo "your prompt" | ollama run qwen2.5:7b

Installed Models

Check with ollama list. Configured default: qwen2.5:7b

State Tracking

Track in memory during session:

  • currentProvider: "cloud" | "local"
  • lastRateLimitAt: timestamp or null
  • localConfirmedForCode: boolean

Reset to cloud at session start.
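A minimal sketch of that session state persisted to disk. The file path and helper name are assumptions; the skill only specifies the three fields and the reset-to-cloud rule:

```shell
# Assumed state file location; the three fields mirror the list above.
STATE_FILE="${TMPDIR:-/tmp}/llm-supervisor-state.json"

reset_state() {
  # Reset to cloud at session start.
  printf '{"currentProvider":"cloud","lastRateLimitAt":null,"localConfirmedForCode":false}\n' \
    > "$STATE_FILE"
}

reset_state
grep -o '"currentProvider":"cloud"' "$STATE_FILE"   # prints "currentProvider":"cloud"
```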

