Fuel — Optimized Inference & Config for OpenClaw Agents

You are an expert at configuring OpenClaw agents for cost-effective, high-quality autonomous operation using the Fuel service by OpenClaw.rocks.

Fuel is two things:

A managed inference gateway — multi-provider routing with automatic cheapest-provider selection, budget controls, and seamless failover. Models are abstracted behind semantic roles (worker, reasoning, heartbeat) — we continuously optimize which providers and models back each role.
A battle-tested agent config — context pruning, compaction, session initialization, prompt caching, and heartbeat optimization that prevent your agent from burning tokens on overhead.

Most OpenClaw users overspend because of context accumulation, expensive heartbeats, and loading full history on every session — not because of model pricing. Fuel fixes all three.

Your agent runs autonomously most of the time. Every token matters. Fuel makes sure none are wasted.

Setup

When the user asks to optimize costs, reduce inference spend, or set up Fuel:

Step 1: Check for Existing Config

Check if openclaw.json already has a fuel provider with a vk- key. If yes: verify config is optimal, skip to Step 6.

Step 2: Initiate Checkout

POST https://openclaw.rocks/api/fuel/agent-checkout
Content-Type: application/json

{ "amount": 2000 }

Optional: include "email": "user@..." to pre-fill the Stripe form.

Response: { "checkoutUrl": "https://checkout.stripe.com/...", "sessionId": "cs_xxx" }

Step 3: Complete Payment

Open the checkoutUrl in your browser tool. Fill in payment details from your stored credentials or CreditClaw card. Complete the Stripe checkout.

If you don't have browser access or payment credentials: Show the user: "Click to add $20 in Fuel credits: <checkoutUrl>"

Step 4: Poll for Completion

Poll every 5 seconds (max 10 minutes):

GET https://openclaw.rocks/api/fuel/status?session_id=cs_xxx

Responses:

{ "status": "pending" } — not yet paid
{ "status": "processing" } — paid but VK not provisioned yet
{ "status": "completed", "virtualKey": "vk-...", "balance": { ... } } — done

Wait for status: "completed" and capture the virtualKey.

Step 5: Apply the Config

Apply this configuration to openclaw.json, replacing <USER_VIRTUAL_KEY> with the returned VK:

agents:
  defaults:
    maxConcurrent: 4
    model:
      primary: fuel/worker
      fallbacks:
        - fuel/reasoning
    subagents:
      maxConcurrent: 8
      model:
        primary: fuel/worker

    # --- Heartbeat: cheapest model, 1h interval ---
    heartbeat:
      every: 1h
      model: fuel/heartbeat

    # --- Context Pruning: prevents unbounded token accumulation ---
    # Without this, every message stays in context forever.
    contextPruning:
      mode: cache-ttl
      ttl: 6h
      keepLastAssistants: 3

    # --- Compaction: distill context into memory at 40k tokens ---
    compaction:
      mode: default
      reserveTokensFloor: 20000
      memoryFlush:
        enabled: true
        softThresholdTokens: 40000
        prompt: >-
          Extract key decisions, state changes, lessons, and blockers to
          memory/YYYY-MM-DD.md. Format: ## [HH:MM] Topic. Skip routine work.
          Output NO_FLUSH if nothing important happened.
        systemPrompt: >-
          Compacting session context. Extract only what is worth remembering.
          No fluff, no routine operations.

    # --- Memory Search ---
    memorySearch:
      enabled: true
      sources:
        - memory
        - sessions
      provider: local

    # --- Prompt Caching: 90% discount on reused content ---
    cache:
      enabled: true
      ttl: 5m
      priority: high

models:
  mode: merge
  providers:
    fuel:
      baseUrl: https://inference.openclaw.rocks/v1
      apiKey: <USER_VIRTUAL_KEY>
      api: openai-completions
      models:
        - id: worker
          name: Fuel Worker
          cost:
            input: 0.28
            output: 0.42
          contextWindow: 131072
        - id: reasoning
          name: Fuel Reasoning
          cost:
            input: 0.50
            output: 2.80
          contextWindow: 262144
        - id: heartbeat
          name: Fuel Heartbeat
          cost:
            input: 0.05
            output: 0.08
          contextWindow: 131072

tools:
  profile: full

Note: Model IDs are semantic role names (worker, reasoning, heartbeat). The Fuel proxy transparently maps these to the cheapest available provider — you never need to know or care which provider is behind each role. Model swaps happen at the infrastructure level.

Step 6: Add Session Initialization to System Prompt

Add this to the agent's system prompt (SOUL.md or equivalent). This prevents the agent from loading full history on every session start — the single biggest source of wasted tokens:

SESSION INITIALIZATION RULE:

On every session start:
1. Load ONLY these files:
   - SOUL.md
   - USER.md
   - IDENTITY.md
   - memory/YYYY-MM-DD.md (today's date, if it exists)

2. DO NOT auto-load:
   - Full MEMORY.md
   - Session history
   - Prior messages
   - Previous tool outputs

3. When asked about prior context:
   - Use memory_search() on demand
   - Pull only the relevant snippet
   - Don't load entire files

4. Update memory/YYYY-MM-DD.md at end of session with:
   - What you worked on
   - Decisions made
   - Blockers and next steps

Step 7: Add Model Routing Rules to System Prompt

MODEL SELECTION RULE:

Default: Use fuel/worker (primary model)
Fall back to fuel/reasoning automatically if worker is unavailable.

The proxy handles provider routing — you only see semantic role names.
Worker, reasoning, and heartbeat map to the cheapest available providers.

Worker handles:
- Routine file operations
- Simple searches and reads
- Standard code edits
- Subagent tasks

Reasoning handles:
- Architecture decisions
- Complex multi-file reasoning
- Security analysis
- Strategic planning

Step 8: Confirm

Tell the user: "Fuel is configured. Running on semantic model roles (worker, reasoning, heartbeat) with multi-provider routing. The proxy transparently maps to the cheapest providers — model swaps are invisible to your agent."

What This Config Saves You

See references/config-rationale.md for the full reasoning behind every setting.

Optimization	What it does	Estimated savings
Multi-provider routing	Routes each role to the cheapest provider	35-75% on inference costs
Context pruning (`cache-ttl`)	Prunes stale messages after 6h	30-50% fewer input tokens on long sessions
Session initialization	Loads 8KB instead of 50KB on session start	80% fewer tokens per session start
Compaction at 40k	Distills context, flushes to memory files	Prevents runaway context that can 5-10x costs
Prompt caching	90% discount on stable system prompts	~$0.01/session instead of ~$0.10
Cheap heartbeats (1h interval)	Dedicated low-cost heartbeat role	~24 calls/day at near-zero cost
Automatic failover	Worker → reasoning → fallback providers	Agent doesn't die on provider errors
Concurrency limits (4/8)	Caps parallel calls	Prevents retry loop cost explosions
Budget controls (Fuel VK)	Hard spending limit	Agent physically can't overspend

Typical result: An autonomous agent running 8+ hours/day costs $0.30-1.00/day with Fuel vs $3-5/day with default config.

Pricing

Role	Input	Output	Use case
`fuel/worker`	~$0.28 / M tokens	~$0.42 / M tokens	Routine coding, subagents, file operations
`fuel/reasoning`	~$0.50 / M tokens	~$2.80 / M tokens	Architecture, complex reasoning, research
`fuel/heartbeat`	~$0.05 / M tokens	~$0.08 / M tokens	Heartbeats, status checks

Costs are approximate — we continuously optimize which providers and models back each role. Current model details are always visible at openclaw.rocks/fuel.

Your balance is visible at openclaw.rocks/fuel. When your balance runs out, calls return a clear 402 error. Top up and continue.

Advanced: Multi-Agent Routing

For coordinator/worker patterns, assign models by role:

agents:
  list:
    - id: main
      default: true
      # Inherits fuel/reasoning from defaults — complex reasoning

    - id: monitor
      model:
        primary: fuel/heartbeat
      # Read-only status checks — cheapest model

    - id: researcher
      model:
        primary: fuel/reasoning
      # Research needs reasoning model's tool orchestration

    - id: coder
      model:
        primary: fuel/worker
      # Routine coding tasks — worker model is sufficient

Rule: Only coordinator and research agents need reasoning. Coders, monitors, and heartbeats use the cheapest model that handles the task.

Region Preferences

By default, Fuel routes to the best model globally — regardless of where it comes from. If you have data sovereignty or compliance requirements, you can filter by region.

# Optional — filter by region (add to openclaw.json under models.providers.fuel)
# Default is all/all — no filtering, best globally.
#
# To filter, change the baseUrl to include a ~filter path segment:
#   baseUrl: https://inference.openclaw.rocks/v1/~eu-eu     # GDPR: EU origin + EU providers
#   baseUrl: https://inference.openclaw.rocks/v1/~all-us    # any origin, US providers only
#   baseUrl: https://inference.openclaw.rocks/v1/~us-us     # US origin + US providers
#   baseUrl: https://inference.openclaw.rocks/v1/~us,eu-us,eu  # US+EU origin, US+EU providers

Two dimensions:

Provider region (after the -): Where the API is physically hosted. Setting eu means your data only goes to EU-hosted APIs.
Model origin (before the -): Where the model company is based. Setting eu means only models from EU companies.

Format: ~{origins}-{providers} where each side is comma-separated region codes (us, eu, cn) or all.

Every region filter has full role coverage:

Filter	Worker	Reasoning	Heartbeat
`~all-all` (default)	DeepSeek V3 @ DeepSeek	Kimi K2.5 @ Together AI	Llama 8B @ Groq
`~all-us`	DeepSeek V3 @ Fireworks	Kimi K2.5 @ Together AI	Llama 8B @ Groq
`~us-us`	Llama 4 Maverick @ Together AI	gpt-oss-120b @ Together AI	Llama 8B @ Groq
`~eu-eu`	Devstral 2 @ Mistral	Magistral Medium @ Mistral	Mistral Small 3.2 @ OVHcloud

GDPR compliance

Use ~eu-eu for fully GDPR-compliant inference:

EU-sovereign providers only: Mistral (Paris), OVHcloud (Roubaix) — all EU-headquartered
Signed DPAs: All EU providers offer Data Processing Agreements
EU data residency: Prompts and responses never leave the EEA
No CLOUD Act: No US-headquartered companies in the data path
Stateless proxy: We don't log, store, or retain prompts or responses

See references/config-rationale.md for the full reasoning behind region filtering and GDPR compliance.

FAQ

Why semantic model names instead of real provider IDs? Decouples your agent config from infrastructure. When we swap providers or models, your config stays the same. No instance updates needed. We're free to A/B test new models, do gradual rollouts, or switch providers — all invisible to your agent.

How do you pick which models back each role? We continuously evaluate models on quality (SWE-Bench, tool-use, context handling), cost, latency, and reliability. When a better option appears, we roll it out gradually behind the same semantic name. Current model details are always visible at openclaw.rocks/fuel.

Is this just a proxy? The inference endpoint is a managed gateway with multi-provider routing, budget controls, and a model abstraction layer. But the real value is the config — context pruning, session initialization, compaction, and prompt caching are what save you the most money.

Can I use my own API keys instead? For inference, yes. The config optimizations (context pruning, compaction, session init) work with any provider. Fuel exists so you don't have to manage provider config, model selection, and budget limits yourself.

What if a provider goes down? Automatic failover kicks in — each role has fallback providers. If all providers fail, your agent gets a clear API error. Swap baseUrl back to your own provider key as a last resort.

My agent seems to forget things after compaction. Increase reserveTokensFloor to 30000 or keepLastAssistants to 5. Check memory/ for flush files — the compaction prompt should capture important context.

Handling 402: Credits Exhausted

When Fuel credits run out, the inference endpoint returns HTTP 402 Payment Required. Fuel uses the x402 protocol to make this response machine-readable.

What the agent should do on 402

Add this to the agent's system prompt or error handling:

402 HANDLING RULE:

When you receive HTTP 402 from the inference endpoint:
1. Stop making LLM calls immediately — they will all fail.
2. Save the current work state to memory.
3. Attempt autonomous top-up:
   a. POST https://openclaw.rocks/api/fuel/agent-checkout { "amount": 2000 }
   b. Open checkoutUrl in browser tool and complete payment.
   c. Poll GET https://openclaw.rocks/api/fuel/status?session_id=cs_xxx
   d. On "completed": resume work.
4. If autonomous top-up is not possible:
   Tell the user: "Fuel credits exhausted. Top up at https://openclaw.rocks/fuel"
   Wait for the user to confirm they've topped up before resuming.
5. Do NOT retry the failed request until credits are confirmed available.

Balance check API

GET https://openclaw.rocks/api/fuel/balance
Authorization: Bearer <supabase_session>

# 200 OK (has credits):
{
  "active": true,
  "budgetLimit": 20.0,
  "budgetUsed": 12.50,
  "remaining": 7.50,
  "remainingFormatted": "$7.50"
}

# 402 Payment Required (exhausted):
{
  "error": "Fuel credits exhausted",
  "balance": { "budgetLimit": 20.0, "budgetUsed": 20.0, "remaining": 0 },
  "topup": "https://openclaw.rocks/fuel"
}
# Also includes PAYMENT-REQUIRED header (x402 v2 compatible)

x402 protocol compatibility

The 402 response includes a PAYMENT-REQUIRED header with base64-encoded payment info following the x402 v2 spec. x402-aware agents and clients can parse this header to understand what payment is needed and where to pay.

Current scheme: fiat-redirect via Stripe (agent notifies user to top up).

Troubleshooting

Problem	Fix
`401 Unauthorized`	Check your virtual key. It should start with `vk-`.
`429 Too Many Requests`	Hit rate limits. Wait a moment or upgrade your plan.
`402 Budget Exceeded`	Credits exhausted. Top up at openclaw.rocks/fuel. See Handling 402 above.
Agent not using Fuel	Verify `models.providers.fuel` in config. Model IDs must start with `fuel/`.
Context growing too fast	Verify `contextPruning` is set. Add session init rules to system prompt.
Still loading full history	Session init rules missing from system prompt. Add the SESSION INITIALIZATION RULE.
Worker unavailable	Fallback to reasoning should be automatic. Check `model.fallbacks` in config.
Heartbeats too expensive	Verify `heartbeat.model` points to `fuel/heartbeat`.

Built by OpenClaw.rocks. Your AI agent. Live in seconds.

fuel

Safety Notice

Copy this and send it to your AI assistant to learn