semantic-model-router

Smart LLM Router — routes every query to the cheapest capable model. Supports 17 models across Anthropic, OpenAI, Google, DeepSeek & xAI (Grok). Uses a pre-trained ML classifier. No extra API keys required.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "semantic-model-router" with this command: npx skills add rayray1218/semantic-model-router

Semantic Model Router

Smart LLM router that saves up to 99% on inference costs by routing each request to the cheapest model that can handle it. Powered by a pre-trained ML classifier and semantic embeddings — no external calls, no API keys needed.

Install

openclaw plugins install @rayray1218/semantic-model-router

Quick Start

from scripts.model_router import ModelRouter

router = ModelRouter()
res = router.route("Design a distributed caching layer for a fintech platform.")
print(res["report"])
# [ClawRouter] anthropic/claude-sonnet-4-6 (ELITE, ml, conf=0.97)
#              Cost: $3.0/M | Baseline: $10.0/M | Saved: 70.0%

How Routing Works

Queries are classified into three tiers through a 3-stage pipeline:

  1. ML Classifier (primary): A Logistic Regression model trained on 6,000+ labeled queries. Runs in <1ms from embedded weights in model_weights.py.
  2. Semantic Embeddings (fallback): Cosine similarity to tier intent vectors via sentence-transformers.
  3. Keyword Rules (last resort): Pattern matching with no dependencies.
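
Each tier has a default model:

Tier      Default Model                Typical Workload                        Cost/1M  vs Baseline
───────────────────────────────────────────────────────────────────────────────────────────────────
BASIC     deepseek/deepseek-chat       Greetings, simple Q&A, chit-chat        $0.14    99% saved
BALANCED  openai/gpt-4o-mini           Summaries, translations, explanations   $0.15    99% saved
ELITE     anthropic/claude-sonnet-4-6  Complex coding, architecture, security  $3.00    70% saved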
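
The three-stage pipeline above can be pictured as a fallback cascade: try the strongest stage first and drop to the next one when a stage is unavailable or not confident enough. The sketch below is an illustration only, not the code in scripts/model_router.py. The stage functions, the 0.6 confidence threshold, and the BALANCED default are all assumptions made for the example; only the stage order and the "ml" label seen in the Quick Start report come from this document.

```python
from typing import Callable, List, Optional, Tuple

Prediction = Tuple[str, float]                 # (tier, confidence)
Stage = Callable[[str], Optional[Prediction]]  # returns None when it has no opinion


def route_with_fallback(
    query: str,
    stages: List[Tuple[str, Stage]],
    min_confidence: float = 0.6,  # hypothetical threshold
) -> Tuple[str, str, float]:
    """Return (tier, stage_name, confidence) from the first confident stage."""
    for name, stage in stages:
        try:
            prediction = stage(query)
        except ImportError:
            # An optional dependency is missing, so skip this stage.
            prediction = None
        if prediction is not None and prediction[1] >= min_confidence:
            tier, confidence = prediction
            return tier, name, confidence
    # Hypothetical default when no stage is confident.
    return "BALANCED", "default", 0.0


# Stand-in stages, invented for this example.
def ml_stage(query: str) -> Optional[Prediction]:
    if "distributed" in query.lower():
        return ("ELITE", 0.97)
    return ("BASIC", 0.40)  # low confidence, so the cascade moves on


def embedding_stage(query: str) -> Optional[Prediction]:
    # Simulates sentence-transformers not being installed.
    raise ImportError("sentence-transformers not installed")


def keyword_stage(query: str) -> Optional[Prediction]:
    if "summarize" in query.lower():
        return ("BALANCED", 1.0)
    return None


STAGES = [("ml", ml_stage), ("embedding", embedding_stage), ("keyword", keyword_stage)]

print(route_with_fallback("Design a distributed caching layer", STAGES))
print(route_with_fallback("Summarize this article", STAGES))
```

In this sketch the first query is settled by the classifier stand-in, while the second falls through to the keyword stage because the classifier is unsure and the embedding stage is unavailable.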

Supported Models (17 total, verified Feb 2026)

Anthropic

Model                              Input /1M  Output /1M  Note
──────────────────────────────────────────────────────────────────────────
anthropic/claude-sonnet-4-6        $3.00      $15.00      ★ ELITE default
anthropic/claude-opus-4-5          $5.00      $25.00
anthropic/claude-haiku-4-5         $0.80      $4.00

OpenAI

Model                              Input /1M  Output /1M  Note
──────────────────────────────────────────────────────────────────────────
openai/gpt-5                       $1.25      $10.00
openai/gpt-4o                      $2.50      $10.00
openai/gpt-4o-mini                 $0.15      $0.60       ★ BALANCED default
openai/o3                          $2.00      $8.00
openai/o4-mini                     $1.10      $4.40

Google

Model                              Input /1M  Output /1M
──────────────────────────────────────────────────────────
google/gemini-3.0-pro              $1.25      $10.00
google/gemini-2.5-pro              $1.25      $10.00
google/gemini-2.5-flash            $0.30      $2.50
google/gemini-2.5-flash-lite       $0.10      $0.40

DeepSeek

Model                              Input /1M  Output /1M  Note
──────────────────────────────────────────────────────────────────────────
deepseek/deepseek-chat (V3.2)      $0.28      $0.42       ★ BASIC default
deepseek/deepseek-reasoner (V3.2)  $0.28      $0.42

xAI (Grok)

Model                              Input /1M  Output /1M
──────────────────────────────────────────────────────────
xai/grok-3                         $3.00      $15.00
xai/grok-3-mini                    $0.30      $0.50

Pricing source: Official API docs of each provider, verified Feb 2026.
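
Because input and output tokens are priced separately, the cost of a single request depends on both counts. As a rough sketch, the helper below applies the per-1M prices listed above to a request. The function name request_cost and the token counts are made up for illustration and are not part of the router.

```python
from typing import Dict, Tuple

# (input $ per 1M tokens, output $ per 1M tokens), copied from the tables above.
PRICES: Dict[str, Tuple[float, float]] = {
    "anthropic/claude-sonnet-4-6": (3.00, 15.00),
    "openai/gpt-4o-mini": (0.15, 0.60),
    "google/gemini-2.5-flash-lite": (0.10, 0.40),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the dollar cost of one request (hypothetical helper)."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Example: a request with 2,000 input tokens and 500 output tokens.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 2_000, 500):.6f}")
```

For that example request, the ELITE default works out to about $0.0135 and the BALANCED default to about $0.0006.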

Override Models at Runtime

from scripts.model_router import ModelRouter

# Use GPT-5.2 for ELITE, Gemini 2.5 Flash for BALANCED, and Gemini 2.5 Flash Lite for BASIC
router = ModelRouter(
    elite_model="openai/gpt-5.2",
    balanced_model="google/gemini-2.5-flash",
    basic_model="google/gemini-2.5-flash-lite",
)
# Swap a tier's model without recreating the router
router.set_model("ELITE", "anthropic/claude-opus-4-5")

List All Available Models (CLI)

python3 scripts/model_router.py --list-models

CLI Usage

# Route a single query
python3 scripts/model_router.py "Implement AES encryption from scratch"

# Override ELITE model
python3 scripts/model_router.py --elite openai/gpt-5.2 "Write a compiler"

# Run full smoke-test
python3 scripts/model_router.py

Dynamic Keyword Expansion

router.add_keywords("ELITE", ["cryptographic proof", "zero-knowledge"])
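
To give a feel for what keyword expansion could do under the hood, here is a minimal sketch of a per-tier keyword store with case-insensitive substring matching. The KeywordRules class, its match method, and the ELITE-first priority order are assumptions for illustration; the real add_keywords implementation may work differently.

```python
from typing import Dict, Iterable, Optional, Set


class KeywordRules:
    """Hypothetical stand-in for the router's keyword stage."""

    # Assumed priority: check the most demanding tier first.
    PRIORITY = ("ELITE", "BALANCED", "BASIC")

    def __init__(self) -> None:
        self._keywords: Dict[str, Set[str]] = {tier: set() for tier in self.PRIORITY}

    def add_keywords(self, tier: str, keywords: Iterable[str]) -> None:
        if tier not in self._keywords:
            raise ValueError(f"unknown tier: {tier!r}")
        # Store lowercase so matching is case-insensitive.
        self._keywords[tier].update(keyword.lower() for keyword in keywords)

    def match(self, query: str) -> Optional[str]:
        text = query.lower()
        for tier in self.PRIORITY:
            if any(keyword in text for keyword in self._keywords[tier]):
                return tier
        return None  # no rule matched


rules = KeywordRules()
rules.add_keywords("ELITE", ["cryptographic proof", "zero-knowledge"])
print(rules.match("Explain a Zero-Knowledge rollup"))  # ELITE
print(rules.match("How are you doing today?"))         # None
```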

Example Output

Query                                              Predicted  Expected   ✓  Cost Info
────────────────────────────────────────────────────────────────────────────────────
How are you doing today?                           BASIC      BASIC      ✓  $0.14/M  saved 98.6%
Summarize this article in three bullet points.     BALANCED   BALANCED   ✓  $0.15/M  saved 98.5%
Implement a thread-safe LRU cache in Python.       ELITE      ELITE      ✓  $3.0/M   saved 70.0%
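
The "saved" column follows from comparing each tier's cost with the $10.0/M baseline shown in the Quick Start report. A small sketch of that arithmetic, with percent_saved as a hypothetical helper name:

```python
def percent_saved(cost_per_million: float, baseline_per_million: float = 10.0) -> float:
    """Percentage saved relative to the baseline price, rounded to one decimal."""
    if baseline_per_million <= 0:
        raise ValueError("baseline must be positive")
    saved = (baseline_per_million - cost_per_million) / baseline_per_million
    return round(saved * 100, 1)


print(percent_saved(0.14))  # 98.6
print(percent_saved(0.15))  # 98.5
print(percent_saved(3.0))   # 70.0
```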

Security & Privacy

  • Zero external calls: All classification runs locally.
  • No API keys: The router itself needs none.
  • Transparent weights: All model parameters live in scripts/model_weights.py — fully auditable.
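
To show why classification can run locally from auditable weights, here is a minimal sketch of multinomial logistic regression inference using parameters stored as plain Python literals. The vocabulary, weights, biases, and bag-of-words featurizer are invented for this example; they are not the contents of scripts/model_weights.py, and the real feature extraction may differ.

```python
import math
from typing import List, Tuple

TIERS = ["BASIC", "BALANCED", "ELITE"]

# Toy parameters for illustration only (one weight row per tier).
VOCAB = ["hello", "summarize", "implement"]
WEIGHTS = [
    [2.0, -1.0, -1.5],   # BASIC
    [-0.5, 2.0, -0.5],   # BALANCED
    [-1.5, -0.5, 2.5],   # ELITE
]
BIASES = [0.2, 0.1, 0.0]


def featurize(query: str) -> List[float]:
    """Count how often each vocabulary word appears in the query."""
    tokens = query.lower().split()
    return [float(tokens.count(word)) for word in VOCAB]


def softmax(logits: List[float]) -> List[float]:
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(query: str) -> Tuple[str, float]:
    """Return (tier, probability) using only local arithmetic."""
    features = featurize(query)
    logits = [
        sum(w * x for w, x in zip(row, features)) + bias
        for row, bias in zip(WEIGHTS, BIASES)
    ]
    probs = softmax(logits)
    best = max(range(len(TIERS)), key=probs.__getitem__)
    return TIERS[best], probs[best]


tier, confidence = classify("Implement a parser")
print(tier, round(confidence, 2))
```

Nothing here touches the network: the prediction is a dot product per tier followed by a softmax, which is why a model of this kind can be shipped as a readable Python file.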

Save costs, route smarter. Built for the OpenClaw community.
