Xrouter
Xrouter is an open-source inference router that sits between OpenClaw and your LLM providers. It uses a fast, hardware-aware classifier to route each request to the most cost-effective model that can handle the task.
This project is released under the MIT License.
Core Features
- OpenAI-compatible reverse proxy at `POST /v1/chat/completions`.
- 3-tier classifier (0 = cheap, 1 = medium, 2 = frontier) with early stream cutoff.
- Hardware detection helper that recommends a local inference engine.
- Provider selection wizard to choose local and cloud endpoints.
- Cache layer with Redis or in-memory LRU fallback.
- Full cloud mode when local inference is not viable.
- Token tracking dashboard at `/dashboard`.
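As an illustration of the in-memory cache fallback, a minimal LRU can be built on `Map` insertion order. This is a sketch, not the actual `src/cache.js` implementation; the class name and default capacity are hypothetical.

```javascript
// Minimal LRU cache sketch: used when REDIS_URL is not set.
// A Map preserves insertion order, so the first key is the least recently used.
class LruCache {
  constructor(capacity = 128) {
    this.capacity = capacity;
    this.map = new Map();
  }
  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    this.map.delete(key); // re-insert to mark as most recently used
    this.map.set(key, value);
    return value;
  }
  set(key, value) {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.capacity) {
      // evict the least recently used entry (first key in insertion order)
      this.map.delete(this.map.keys().next().value);
    }
  }
}
```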
Workflow
```mermaid
flowchart TD
    A["Client / OpenClaw request"] --> B["Router (OpenAI-compatible)"]
    B --> C{"Classifier enabled?"}
    C -->|No| F["Route to Frontier provider"]
    C -->|Yes| D["Classifier (0 / 1 / 2)"]
    D --> E{"Decision"}
    E -->|0| G["Route to Cheap provider"]
    E -->|1| M["Route to Medium provider"]
    E -->|2| F
    G --> H["Provider adapter (auto or explicit)"]
    M --> H
    F --> H
    H --> I["Upstream API call"]
    I --> J["Stream/Response back to client"]
```
Repository Layout
- `src/server.js`: router and streaming proxy.
- `src/classifier.js`: classifier call and retry logic.
- `src/config.js`: configuration and env parsing.
- `src/cache.js`: Redis + LRU cache.
- `src/token_tracker.js`: token tracking.
- `scripts/check_hw.js`: hardware detection.
- `scripts/configure_providers.js`: interactive provider setup.
Requirements
- Node.js 20+.
- Local classifier engine (optional).
- A frontier provider endpoint (required).
Quickstart
- Install dependencies.
- (Optional) Start a local model server.
- Run the configuration wizard.
- Start the router.
```bash
npm install
npm run configure
npm run dev
```
How To Use
- Start your local model server (optional but recommended).
- Run the wizard to configure providers and models.
- Start the router.
- Send OpenAI-compatible requests to the router.
- Inspect routing decisions in response headers or the dashboard.
Example local setup (Ollama):
```bash
ollama pull llama3.1
ollama run llama3.1
```
Run the wizard:
```bash
npm run configure
```
Start the router:
```bash
npm run dev
```
Test a request:
```bash
curl -i http://localhost:3000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"any","messages":[{"role":"user","content":"Fix this sentence: I has a apple."}]}'
```
Look for these headers:
- `X-Xrouter-decision`: `0`, `1`, or `2`.
- `X-Xrouter-upstream`: `cheap`, `medium`, or `frontier`.
Open the dashboard:
http://localhost:3000/dashboard
Raw usage JSON:
http://localhost:3000/usage
Provider Selection (Terminal Wizard)
Run:
```bash
npm run configure
```
The wizard:
- Scans hardware and recommends a local engine.
- Suggests a local classifier model.
- Lets you choose provider base URLs, API keys, and model overrides for cheap/medium/frontier routes.
- Writes `upstreams.json` and optionally updates `.env`.
Quick Start Mode
- If your machine can run a local model, you can choose Quick Start.
- Quick Start auto-configures the local classifier.
- The cheap route always uses the same local model as the classifier to avoid Ollama model swapping.
- You only need to choose medium and frontier providers/models.
- On Apple Silicon (Ollama), the wizard lists installed Ollama models and can auto-download a recommended model.
Routing Behavior
- The classifier is called for each uncached request.
- The first `0`, `1`, or `2` token returned decides the route.
- If classification fails, the router defaults to the frontier route.
- When the classifier is enabled, cheap, medium, and frontier routes must be configured.
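The routing rules above can be sketched as follows. This is a simplified illustration; the function name is hypothetical and the actual `src/server.js` logic may differ.

```javascript
// Map the classifier's first token to an upstream route.
// Any failure (timeout, unparseable token) falls back to frontier.
const ROUTES = { '0': 'cheap', '1': 'medium', '2': 'frontier' };

function pickRoute({ classifierEnabled, firstToken }) {
  if (!classifierEnabled) return 'frontier'; // classifier off: everything goes frontier
  const route = ROUTES[(firstToken ?? '').trim()];
  return route ?? 'frontier'; // classification failed: default to frontier
}
```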
Compatibility
- The router accepts OpenAI-style requests and translates when needed.
- Provider type can be explicit (
xrouter,openai_compatible,openai,anthropic,gemini,cohere,azure_openai,mistral,groq,together,perplexity) orauto. autoinfers the provider adapter from the base URL or API key.- Providers that expose OpenAI-compatible endpoints use the
openai_compatibleadapter. - Anthropic/Gemini/Cohere streaming is translated into OpenAI-style SSE chunks.
- Non-OpenAI adapters currently support text-only messages and basic sampling params (temperature/top_p/stop).
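As a rough illustration of `auto` inference, a heuristic like the following could map base URLs to adapters. The hosts and ordering here are assumptions for illustration, not the actual `src/config.js` logic.

```javascript
// Rough sketch of `auto` provider inference from a base URL.
// The real heuristics (and the hosts they match) may differ.
function inferProvider(baseUrl) {
  const host = new URL(baseUrl).hostname;
  if (host.endsWith('anthropic.com')) return 'anthropic';
  if (host.endsWith('googleapis.com')) return 'gemini';
  if (host.endsWith('cohere.com')) return 'cohere';
  if (host.endsWith('openai.azure.com')) return 'azure_openai';
  if (host.endsWith('openai.com')) return 'openai';
  return 'openai_compatible'; // default: treat unknown endpoints as OpenAI-compatible
}
```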
Token Tracking Dashboard
- `GET /usage`: returns cumulative token usage for `cheap`, `medium`, and `frontier`.
- `GET /dashboard`: UI that displays token split and totals.
- Local usage is counted under `cheap` when the cheap route uses the local model.
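The cumulative per-route counters behind `/usage` can be sketched as a simple accumulator. This is illustrative only; the field names are assumptions, not the actual `src/token_tracker.js` schema.

```javascript
// Accumulate token usage per route, as a /usage-style summary might.
function makeTracker() {
  const usage = { cheap: 0, medium: 0, frontier: 0 };
  return {
    record(route, promptTokens, completionTokens) {
      usage[route] += promptTokens + completionTokens;
    },
    summary() {
      const total = usage.cheap + usage.medium + usage.frontier;
      return { ...usage, total };
    },
  };
}
```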
Environment Summary
- `HOST`: bind host, default `0.0.0.0`.
- `PORT`: bind port, default `3000`.
- `ROUTER_API_KEY`: require `Authorization: Bearer <key>`.
- `LOG_LEVEL`: log level (`debug`/`info`/`warn`/`error`).
- `LOG_TO_FILE`: set `true` to write logs to files.
- `LOG_DIR`: directory for log files (default `./logs`).
- `CLASSIFIER_ENABLED`: set `false` to disable local classification.
- `CLASSIFIER_BASE_URL`: OpenAI-compatible classifier endpoint.
- `CLASSIFIER_MODEL`: classifier model name.
- `CLASSIFIER_SYSTEM_PROMPT`: classifier prompt (single line).
- `CLASSIFIER_TIMEOUT_MS`: classifier timeout.
- `CLASSIFIER_FORCE_STREAM`: force streaming classifier request.
- `CLASSIFIER_WARMUP`: warm the classifier on server start.
- `CLASSIFIER_WARMUP_DELAY_MS`: delay before warmup request (ms).
- `CLASSIFIER_KEEP_ALIVE_MS`: keep-alive interval for classifier warmup (ms).
- `CLASSIFIER_LOADING_RETRY_MS`: delay between retries when the model is loading.
- `CLASSIFIER_LOADING_MAX_RETRIES`: max retries when the model is loading.
- `CHEAP_BASE_URL`: optional, defaults to classifier base URL.
- `CHEAP_API_KEY`: cheap provider API key.
- `CHEAP_MODEL`: optional model override for cheap route.
- `CHEAP_PROVIDER`: provider type for cheap route (`auto` if empty).
- `CHEAP_HEADERS`: optional JSON headers for cheap provider (stringified object).
- `CHEAP_DEPLOYMENT`: Azure deployment override for cheap route.
- `CHEAP_API_VERSION`: Azure API version override for cheap route.
- `MEDIUM_BASE_URL`: required when classifier is enabled.
- `MEDIUM_API_KEY`: medium provider API key.
- `MEDIUM_MODEL`: optional model override for medium route.
- `MEDIUM_PROVIDER`: provider type for medium route (`auto` if empty).
- `MEDIUM_HEADERS`: optional JSON headers for medium provider (stringified object).
- `MEDIUM_DEPLOYMENT`: Azure deployment override for medium route.
- `MEDIUM_API_VERSION`: Azure API version override for medium route.
- `FRONTIER_BASE_URL`: OpenAI-compatible frontier endpoint.
- `FRONTIER_API_KEY`: frontier API key.
- `FRONTIER_MODEL`: optional model override for frontier route.
- `FRONTIER_PROVIDER`: provider type for frontier route (`auto` if empty).
- `FRONTIER_HEADERS`: optional JSON headers for frontier provider (stringified object).
- `FRONTIER_DEPLOYMENT`: Azure deployment override for frontier route.
- `FRONTIER_API_VERSION`: Azure API version override for frontier route.
- `REDIS_URL`: if set, enables Redis cache.
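A minimal `.env` sketch combining the variables above (only the variable names come from this document; all values are placeholders):

```
# Local classifier (Ollama example); the cheap route defaults to the same base URL
CLASSIFIER_BASE_URL=http://localhost:11434
CLASSIFIER_MODEL=llama3.1

# Medium and frontier routes (placeholder URLs and keys)
MEDIUM_BASE_URL=https://medium.example.invalid/v1
MEDIUM_API_KEY=sk-...
FRONTIER_BASE_URL=https://frontier.example.invalid/v1
FRONTIER_API_KEY=sk-...

# Optional
PORT=3000
REDIS_URL=redis://localhost:6379
```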
Local Model Installation & Run Guides
Ollama (best for Mac, easiest cross-platform)
- Install: Ollama Quickstart
- Pull a model: `ollama pull llama3.1`
- Run: `ollama run llama3.1`
- Base URL: `http://localhost:11434`
- Router config: `CLASSIFIER_BASE_URL=http://localhost:11434`, `CLASSIFIER_MODEL=llama3.1`
vLLM (NVIDIA GPU)
- OpenAI server: vLLM OpenAI Server
- Example: `vllm serve NousResearch/Meta-Llama-3-8B-Instruct --dtype auto --api-key token-abc123`
- Base URL: `http://localhost:8000`
- Router config: `CLASSIFIER_BASE_URL=http://localhost:8000`, `CLASSIFIER_MODEL=NousResearch/Meta-Llama-3-8B-Instruct`
TensorRT-LLM (NVIDIA, max speed)
- Repo: TensorRT-LLM
- Server: trtllm-serve
- Base URL: `http://<host>:<port>`
- Router config: `CLASSIFIER_BASE_URL=http://<host>:<port>`, `CLASSIFIER_MODEL=<your model>`
llama.cpp (CPU/AMD fallback)
- Repo: llama.cpp
- Example: `llama-server -m model.gguf --port 8080`
- Base URL: `http://localhost:8080`
- Router config: `CLASSIFIER_BASE_URL=http://localhost:8080`, `CLASSIFIER_MODEL=<gguf model name>`
Docker
Build and run the router with Redis:
```bash
docker compose -f deploy/docker-compose.yml up --build
```
Hardware Detection
Run:
```bash
npm run check-hw
```
This prints the recommended engine:
- `tensorrt-llm` for large NVIDIA GPUs.
- `vllm` for standard NVIDIA GPUs.
- `mlx` for Apple Silicon.
- `llama.cpp` for CPU/AMD fallback.
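A sketch of that mapping, with a hypothetical VRAM threshold; the actual probes in `scripts/check_hw.js` may use different cutoffs and signals.

```javascript
// Sketch of engine recommendation from detected hardware.
// The 24 GB VRAM threshold for "large" GPUs is an illustrative assumption.
function recommendEngine({ platform, hasNvidiaGpu, vramGb }) {
  if (hasNvidiaGpu) {
    return vramGb >= 24 ? 'tensorrt-llm' : 'vllm';
  }
  if (platform === 'darwin') return 'mlx'; // Apple Silicon
  return 'llama.cpp';                      // CPU / AMD fallback
}
```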
Model List Fetching
- The wizard queries provider model list endpoints when possible.
- OpenAI-compatible:
/v1/models - Anthropic:
/v1/models - Gemini:
/v1beta/models - Cohere:
/v1/models - If listing fails, the wizard falls back to
scripts/cloud_model_catalog.json.
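The per-provider endpoints above can be captured in a small lookup table. This is a sketch of the idea; the wizard's internals may differ.

```javascript
// Model-list endpoint per provider type. A null result means the provider
// cannot be listed and the bundled catalog (scripts/cloud_model_catalog.json)
// should be used instead.
const MODEL_LIST_PATHS = {
  openai_compatible: '/v1/models',
  anthropic: '/v1/models',
  gemini: '/v1beta/models',
  cohere: '/v1/models',
};

function modelListPath(provider) {
  return MODEL_LIST_PATHS[provider] ?? null; // null => catalog fallback
}
```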