fuel

Optimized LLM inference and agent config for OpenClaw. Multi-provider routing with automatic cheapest-provider selection, context pruning, smart compaction, cheap heartbeats, session initialization, prompt caching, and memory management — all calibrated for autonomous agents that run for hours without wasting tokens. Triggers on: 'save on inference,' 'cheaper models,' 'optimize costs,' 'LLM config,' 'model routing,' 'inference setup,' 'fuel,' 'reduce token usage,' 'context management,' or any request to make an OpenClaw agent more cost-effective.

Install skill "fuel" with this command: npx skills add openclaw-rocks/skills/openclaw-rocks-skills-fuel

Fuel — Optimized Inference & Config for OpenClaw Agents

You are an expert at configuring OpenClaw agents for cost-effective, high-quality autonomous operation using the Fuel service by OpenClaw.rocks.

Fuel is two things:

  1. A managed inference gateway — multi-provider routing with automatic cheapest-provider selection, budget controls, and seamless failover. Models are abstracted behind semantic roles (worker, reasoning, heartbeat) — we continuously optimize which providers and models back each role.
  2. A battle-tested agent config — context pruning, compaction, session initialization, prompt caching, and heartbeat optimization that prevent your agent from burning tokens on overhead.

Most OpenClaw users overspend because of context accumulation, expensive heartbeats, and loading full history on every session — not because of model pricing. Fuel fixes all three.

Your agent runs autonomously most of the time. Every token matters. Fuel makes sure none are wasted.


Setup

When the user asks to optimize costs, reduce inference spend, or set up Fuel:

Step 1: Check for Existing Config

Check whether openclaw.json already defines a fuel provider (models.providers.fuel) whose apiKey starts with vk-. If it does, verify that the config matches the one in Step 5, then skip to Step 6.
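A minimal sketch of that check, assuming openclaw.json has already been parsed into a Python dict and that the key lives at models.providers.fuel.apiKey as in the Step 5 config. The function name has_fuel_provider is hypothetical.

```python
def has_fuel_provider(config: dict) -> bool:
    """Return True if the config defines a fuel provider whose apiKey starts with vk-."""
    models = config.get("models") or {}
    providers = models.get("providers") or {}
    fuel = providers.get("fuel")
    if not isinstance(fuel, dict):
        return False
    api_key = fuel.get("apiKey", "")
    # A placeholder such as <USER_VIRTUAL_KEY> does not count as configured.
    return isinstance(api_key, str) and api_key.startswith("vk-")
```

If this returns False, continue with Step 2. If it returns True, the checkout steps can be skipped.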

Step 2: Initiate Checkout

POST https://openclaw.rocks/api/fuel/agent-checkout
Content-Type: application/json

{ "amount": 2000 }

The amount is in cents (2000 = $20). Optional: include "email": "user@..." to pre-fill the Stripe form.

Response: { "checkoutUrl": "https://checkout.stripe.com/...", "sessionId": "cs_xxx" }
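The request and response above could be handled roughly as follows. This is a sketch using only the Python standard library; the helper names build_checkout_request and parse_checkout_response are hypothetical, and the request is only constructed here, not sent.

```python
import json
import urllib.request
from typing import Optional, Tuple

CHECKOUT_URL = "https://openclaw.rocks/api/fuel/agent-checkout"


def build_checkout_request(amount: int = 2000, email: Optional[str] = None) -> urllib.request.Request:
    """Build the POST request for a Fuel checkout session."""
    payload = {"amount": amount}
    if email:
        payload["email"] = email  # optional: pre-fills the Stripe form
    return urllib.request.Request(
        CHECKOUT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_checkout_response(body: bytes) -> Tuple[str, str]:
    """Extract (checkoutUrl, sessionId) from the JSON response body."""
    data = json.loads(body)
    return data["checkoutUrl"], data["sessionId"]
```

Passing the request to urllib.request.urlopen() would perform the call; the sessionId it returns is what Step 4 polls on.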

Step 3: Complete Payment

Open the checkoutUrl in your browser tool. Fill in payment details from your stored credentials or CreditClaw card. Complete the Stripe checkout.

If you don't have browser access or payment credentials, show the user: "Click to add $20 in Fuel credits: <checkoutUrl>"

Step 4: Poll for Completion

Poll every 5 seconds (max 10 minutes):

GET https://openclaw.rocks/api/fuel/status?session_id=cs_xxx

Responses:

  • { "status": "pending" } — not yet paid
  • { "status": "processing" } — paid, but the virtual key (VK) is not provisioned yet
  • { "status": "completed", "virtualKey": "vk-...", "balance": { ... } } — done

Wait for status: "completed" and capture the virtualKey.
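The polling loop can be sketched as below. The fetch_status callable is an assumption: it stands in for whatever performs the GET on the status endpoint and returns the parsed JSON. The sleep function is injectable so the loop can be exercised without waiting.

```python
import time
from typing import Callable, Optional


def poll_for_virtual_key(
    fetch_status: Callable[[], dict],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll until status is 'completed'; return the virtualKey, or None on timeout."""
    attempts = int(timeout_s // interval_s)  # 600 / 5 = 120 attempts by default
    for _ in range(attempts):
        status = fetch_status()
        if status.get("status") == "completed":
            return status.get("virtualKey")
        # "pending" and "processing" both mean: keep waiting
        sleep(interval_s)
    return None
```

A None result means the 10 minute window elapsed without a completed payment, so the agent should fall back to asking the user.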

Step 5: Apply the Config

Apply this configuration to openclaw.json, replacing <USER_VIRTUAL_KEY> with the returned VK:

agents:
  defaults:
    maxConcurrent: 4
    model:
      primary: fuel/worker
      fallbacks:
        - fuel/reasoning
    subagents:
      maxConcurrent: 8
      model:
        primary: fuel/worker

    # --- Heartbeat: cheapest model, 1h interval ---
    heartbeat:
      every: 1h
      model: fuel/heartbeat

    # --- Context Pruning: prevents unbounded token accumulation ---
    # Without this, every message stays in context forever.
    contextPruning:
      mode: cache-ttl
      ttl: 6h
      keepLastAssistants: 3

    # --- Compaction: distill context into memory at 40k tokens ---
    compaction:
      mode: default
      reserveTokensFloor: 20000
      memoryFlush:
        enabled: true
        softThresholdTokens: 40000
        prompt: >-
          Extract key decisions, state changes, lessons, and blockers to
          memory/YYYY-MM-DD.md. Format: ## [HH:MM] Topic. Skip routine work.
          Output NO_FLUSH if nothing important happened.
        systemPrompt: >-
          Compacting session context. Extract only what is worth remembering.
          No fluff, no routine operations.

    # --- Memory Search ---
    memorySearch:
      enabled: true
      sources:
        - memory
        - sessions
      provider: local

    # --- Prompt Caching: 90% discount on reused content ---
    cache:
      enabled: true
      ttl: 5m
      priority: high

models:
  mode: merge
  providers:
    fuel:
      baseUrl: https://inference.openclaw.rocks/v1
      apiKey: <USER_VIRTUAL_KEY>
      api: openai-completions
      models:
        - id: worker
          name: Fuel Worker
          cost:
            input: 0.28
            output: 0.42
          contextWindow: 131072
        - id: reasoning
          name: Fuel Reasoning
          cost:
            input: 0.50
            output: 2.80
          contextWindow: 262144
        - id: heartbeat
          name: Fuel Heartbeat
          cost:
            input: 0.05
            output: 0.08
          contextWindow: 131072

tools:
  profile: full

Note: Model IDs are semantic role names (worker, reasoning, heartbeat). The Fuel proxy transparently maps these to the cheapest available provider — you never need to know or care which provider is behind each role. Model swaps happen at the infrastructure level.

Step 6: Add Session Initialization to System Prompt

Add this to the agent's system prompt (SOUL.md or equivalent). This prevents the agent from loading full history on every session start — the single biggest source of wasted tokens:

SESSION INITIALIZATION RULE:

On every session start:
1. Load ONLY these files:
   - SOUL.md
   - USER.md
   - IDENTITY.md
   - memory/YYYY-MM-DD.md (today's date, if it exists)

2. DO NOT auto-load:
   - Full MEMORY.md
   - Session history
   - Prior messages
   - Previous tool outputs

3. When asked about prior context:
   - Use memory_search() on demand
   - Pull only the relevant snippet
   - Don't load entire files

4. Update memory/YYYY-MM-DD.md at end of session with:
   - What you worked on
   - Decisions made
   - Blockers and next steps
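As an illustration of rule 1, the set of files loaded at session start could be computed like this. The function name, the workspace argument, and the injectable exists check are hypothetical; only the file names come from the rule above.

```python
from datetime import date
from pathlib import Path
from typing import Callable, List, Optional

ALWAYS_LOAD = ["SOUL.md", "USER.md", "IDENTITY.md"]


def session_start_files(
    workspace: Path,
    today: Optional[date] = None,
    exists: Callable[[Path], bool] = Path.exists,
) -> List[Path]:
    """Return the only files an agent should load when a session starts."""
    today = today or date.today()
    files = [workspace / name for name in ALWAYS_LOAD]
    daily = workspace / "memory" / f"{today.isoformat()}.md"  # memory/YYYY-MM-DD.md
    if exists(daily):
        files.append(daily)  # today's notes are loaded only if the file exists
    return files
```

Everything else (full MEMORY.md, session history, prior tool outputs) stays on disk until a memory search asks for a specific snippet.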

Step 7: Add Model Routing Rules to System Prompt

MODEL SELECTION RULE:

Default: Use fuel/worker (primary model)
Fall back to fuel/reasoning automatically if worker is unavailable.

The proxy handles provider routing — you only see semantic role names.
Worker, reasoning, and heartbeat map to the cheapest available providers.

Worker handles:
- Routine file operations
- Simple searches and reads
- Standard code edits
- Subagent tasks

Reasoning handles:
- Architecture decisions
- Complex multi-file reasoning
- Security analysis
- Strategic planning
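The routing rule amounts to a small lookup. The sketch below assumes the caller already labels each task with a category; the category strings are hypothetical names for the bullets above, not identifiers defined by Fuel.

```python
# Hypothetical labels for the "Reasoning handles" bullets.
REASONING_TASKS = {
    "architecture",
    "multi_file_reasoning",
    "security_analysis",
    "strategic_planning",
}


def select_model(task_kind: str) -> str:
    """Pick the semantic role for a task: reasoning for hard tasks, worker otherwise."""
    if task_kind in REASONING_TASKS:
        return "fuel/reasoning"
    return "fuel/worker"  # default for file operations, searches, edits, subagents
```

Unknown categories fall through to fuel/worker, which matches the rule's default.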

Step 8: Confirm

Tell the user: "Fuel is configured. Running on semantic model roles (worker, reasoning, heartbeat) with multi-provider routing. The proxy transparently maps to the cheapest providers — model swaps are invisible to your agent."


What This Config Saves You

See references/config-rationale.md for the full reasoning behind every setting.

| Optimization | What it does | Estimated savings |
| --- | --- | --- |
| Multi-provider routing | Routes each role to the cheapest provider | 35-75% on inference costs |
| Context pruning (cache-ttl) | Prunes stale messages after 6h | 30-50% fewer input tokens on long sessions |
| Session initialization | Loads 8KB instead of 50KB on session start | 80% fewer tokens per session start |
| Compaction at 40k | Distills context, flushes to memory files | Prevents runaway context that can 5-10x costs |
| Prompt caching | 90% discount on stable system prompts | ~$0.01/session instead of ~$0.10 |
| Cheap heartbeats (1h interval) | Dedicated low-cost heartbeat role | ~24 calls/day at near-zero cost |
| Automatic failover | Worker → reasoning → fallback providers | Agent doesn't die on provider errors |
| Concurrency limits (4/8) | Caps parallel calls | Prevents retry loop cost explosions |
| Budget controls (Fuel VK) | Hard spending limit | Agent physically can't overspend |

Typical result: An autonomous agent running 8+ hours/day costs $0.30-1.00/day with Fuel vs $3-5/day with default config.


Pricing

| Role | Input | Output | Use case |
| --- | --- | --- | --- |
| fuel/worker | ~$0.28 / M tokens | ~$0.42 / M tokens | Routine coding, subagents, file operations |
| fuel/reasoning | ~$0.50 / M tokens | ~$2.80 / M tokens | Architecture, complex reasoning, research |
| fuel/heartbeat | ~$0.05 / M tokens | ~$0.08 / M tokens | Heartbeats, status checks |

Costs are approximate — we continuously optimize which providers and models back each role. Current model details are always visible at openclaw.rocks/fuel.
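To turn the per-million-token prices into a per-call estimate, a small helper like the following could be used. It is a sketch that hardcodes the approximate prices listed above, so its results are estimates only; the function name is hypothetical.

```python
# Approximate USD prices per million tokens: (input, output).
PRICES_PER_MILLION = {
    "fuel/worker": (0.28, 0.42),
    "fuel/reasoning": (0.50, 2.80),
    "fuel/heartbeat": (0.05, 0.08),
}


def estimate_cost(role: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a call from its token counts."""
    input_price, output_price = PRICES_PER_MILLION[role]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000
```

For example, estimate_cost("fuel/worker", 1_000_000, 1_000_000) comes to roughly $0.70, the sum of the worker input and output prices.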

Your balance is visible at openclaw.rocks/fuel. When your balance runs out, calls return a clear 402 error. Top up and continue.


Advanced: Multi-Agent Routing

For coordinator/worker patterns, assign models by role:

agents:
  list:
    - id: main
      default: true
      model:
        primary: fuel/reasoning
      # Coordinator: complex reasoning (overrides the fuel/worker default)

    - id: monitor
      model:
        primary: fuel/heartbeat
      # Read-only status checks — cheapest model

    - id: researcher
      model:
        primary: fuel/reasoning
      # Research needs reasoning model's tool orchestration

    - id: coder
      model:
        primary: fuel/worker
      # Routine coding tasks — worker model is sufficient

Rule: Only coordinator and research agents need reasoning. Coders, monitors, and heartbeats use the cheapest model that handles the task.


Region Preferences

By default, Fuel routes to the best model globally — regardless of where it comes from. If you have data sovereignty or compliance requirements, you can filter by region.

# Optional — filter by region (add to openclaw.json under models.providers.fuel)
# Default is all/all — no filtering, best globally.
#
# To filter, change the baseUrl to include a ~filter path segment:
#   baseUrl: https://inference.openclaw.rocks/v1/~eu-eu     # GDPR: EU origin + EU providers
#   baseUrl: https://inference.openclaw.rocks/v1/~all-us    # any origin, US providers only
#   baseUrl: https://inference.openclaw.rocks/v1/~us-us     # US origin + US providers
#   baseUrl: https://inference.openclaw.rocks/v1/~us,eu-us,eu  # US+EU origin, US+EU providers

Two dimensions:

  • Provider region (after the -): Where the API is physically hosted. Setting eu means your data only goes to EU-hosted APIs.
  • Model origin (before the -): Where the model company is based. Setting eu means only models from EU companies.

Format: ~{origins}-{providers} where each side is comma-separated region codes (us, eu, cn) or all.
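The filter format can be assembled programmatically. The sketch below builds a baseUrl from two region lists, assuming the region codes listed above are the only valid ones; the function names are hypothetical. When both sides are all it returns the plain base URL, since that is the unfiltered default.

```python
from typing import Iterable

BASE_URL = "https://inference.openclaw.rocks/v1"
VALID_REGIONS = {"us", "eu", "cn"}


def _side(regions: Iterable[str]) -> str:
    """Render one side of the filter: 'all' or comma-separated region codes."""
    codes = list(regions)
    if codes == ["all"]:
        return "all"
    if not codes or any(code not in VALID_REGIONS for code in codes):
        raise ValueError(f"unsupported region list: {codes!r}")
    return ",".join(codes)


def region_base_url(
    origins: Iterable[str] = ("all",),
    providers: Iterable[str] = ("all",),
) -> str:
    """Build a baseUrl of the form .../v1/~{origins}-{providers}."""
    origin_part, provider_part = _side(origins), _side(providers)
    if origin_part == provider_part == "all":
        return BASE_URL  # no filtering requested
    return f"{BASE_URL}/~{origin_part}-{provider_part}"
```

Calling region_base_url(["eu"], ["eu"]) yields the ~eu-eu URL shown in the comments above.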

Every region filter has full role coverage:

| Filter | Worker | Reasoning | Heartbeat |
| --- | --- | --- | --- |
| ~all-all (default) | DeepSeek V3 @ DeepSeek | Kimi K2.5 @ Together AI | Llama 8B @ Groq |
| ~all-us | DeepSeek V3 @ Fireworks | Kimi K2.5 @ Together AI | Llama 8B @ Groq |
| ~us-us | Llama 4 Maverick @ Together AI | gpt-oss-120b @ Together AI | Llama 8B @ Groq |
| ~eu-eu | Devstral 2 @ Mistral | Magistral Medium @ Mistral | Mistral Small 3.2 @ OVHcloud |

GDPR compliance

Use ~eu-eu for fully GDPR-compliant inference:

  • EU-sovereign providers only: Mistral (Paris), OVHcloud (Roubaix) — all EU-headquartered
  • Signed DPAs: All EU providers offer Data Processing Agreements
  • EU data residency: Prompts and responses never leave the EEA
  • No CLOUD Act: No US-headquartered companies in the data path
  • Stateless proxy: We don't log, store, or retain prompts or responses

See references/config-rationale.md for the full reasoning behind region filtering and GDPR compliance.


FAQ

Why semantic model names instead of real provider IDs? Semantic names decouple your agent config from infrastructure. When we swap providers or models, your config stays the same and no instance updates are needed. We're free to A/B test new models, do gradual rollouts, or switch providers, all invisibly to your agent.

How do you pick which models back each role? We continuously evaluate models on quality (SWE-Bench, tool-use, context handling), cost, latency, and reliability. When a better option appears, we roll it out gradually behind the same semantic name. Current model details are always visible at openclaw.rocks/fuel.

Is this just a proxy? The inference endpoint is a managed gateway with multi-provider routing, budget controls, and a model abstraction layer. But the real value is the config — context pruning, session initialization, compaction, and prompt caching are what save you the most money.

Can I use my own API keys instead? For inference, yes. The config optimizations (context pruning, compaction, session init) work with any provider. Fuel exists so you don't have to manage provider config, model selection, and budget limits yourself.

What if a provider goes down? Automatic failover kicks in: each role has fallback providers. If all providers fail, your agent gets a clear API error. As a last resort, point baseUrl and apiKey back at your own provider.

My agent seems to forget things after compaction. Increase reserveTokensFloor to 30000 or keepLastAssistants to 5. Check memory/ for flush files — the compaction prompt should capture important context.


Handling 402: Credits Exhausted

When Fuel credits run out, the inference endpoint returns HTTP 402 Payment Required. Fuel uses the x402 protocol to make this response machine-readable.

What the agent should do on 402

Add this to the agent's system prompt or error handling:

402 HANDLING RULE:

When you receive HTTP 402 from the inference endpoint:
1. Stop making LLM calls immediately — they will all fail.
2. Save the current work state to memory.
3. Attempt autonomous top-up:
   a. POST https://openclaw.rocks/api/fuel/agent-checkout { "amount": 2000 }
   b. Open checkoutUrl in browser tool and complete payment.
   c. Poll GET https://openclaw.rocks/api/fuel/status?session_id=cs_xxx
   d. On "completed": resume work.
4. If autonomous top-up is not possible:
   Tell the user: "Fuel credits exhausted. Top up at https://openclaw.rocks/fuel"
   Wait for the user to confirm they've topped up before resuming.
5. Do NOT retry the failed request until credits are confirmed available.
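In code, the rule could look like the sketch below. The three callbacks are assumptions standing in for the agent's own memory, checkout, and messaging facilities; the returned strings are hypothetical state names.

```python
from typing import Callable

TOP_UP_MESSAGE = "Fuel credits exhausted. Top up at https://openclaw.rocks/fuel"


def handle_inference_status(
    status_code: int,
    save_state: Callable[[], None],
    try_top_up: Callable[[], bool],
    notify_user: Callable[[str], None],
) -> str:
    """Decide what to do after an inference call, following the 402 handling rule."""
    if status_code != 402:
        return "continue"
    save_state()  # persist work before anything else, since further LLM calls will fail
    if try_top_up():  # checkout, payment, and polling succeeded
        return "resume"
    notify_user(TOP_UP_MESSAGE)
    return "wait_for_user"  # do not retry until the user confirms the top-up
```

The caller is expected to stop issuing LLM calls whenever the result is anything other than continue or resume.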

Balance check API

GET https://openclaw.rocks/api/fuel/balance
Authorization: Bearer <supabase_session>

# 200 OK (has credits):
{
  "active": true,
  "budgetLimit": 20.0,
  "budgetUsed": 12.50,
  "remaining": 7.50,
  "remainingFormatted": "$7.50"
}

# 402 Payment Required (exhausted):
{
  "error": "Fuel credits exhausted",
  "balance": { "budgetLimit": 20.0, "budgetUsed": 20.0, "remaining": 0 },
  "topup": "https://openclaw.rocks/fuel"
}
# Also includes PAYMENT-REQUIRED header (x402 v2 compatible)

x402 protocol compatibility

The 402 response includes a PAYMENT-REQUIRED header with base64-encoded payment info following the x402 v2 spec. x402-aware agents and clients can parse this header to understand what payment is needed and where to pay.
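A client could decode the header along these lines. This sketch assumes the base64 payload is a JSON object; the fields inside it are not specified here, so the example does not rely on any particular field names.

```python
import base64
import json


def decode_payment_required(header_value: str) -> dict:
    """Decode a base64-encoded PAYMENT-REQUIRED header value into a dict (assumes JSON)."""
    # Restore padding in case the sender stripped the trailing "=" characters.
    padded = header_value + "=" * (-len(header_value) % 4)
    return json.loads(base64.b64decode(padded))
```

An agent that cannot parse the header can still fall back to the topup URL in the JSON body of the 402 response.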

Current scheme: fiat-redirect via Stripe (agent notifies user to top up).


Troubleshooting

| Problem | Fix |
| --- | --- |
| 401 Unauthorized | Check your virtual key. It should start with vk-. |
| 429 Too Many Requests | Hit rate limits. Wait a moment or upgrade your plan. |
| 402 Budget Exceeded | Credits exhausted. Top up at openclaw.rocks/fuel. See Handling 402 above. |
| Agent not using Fuel | Verify models.providers.fuel in config. Model IDs must start with fuel/. |
| Context growing too fast | Verify contextPruning is set. Add session init rules to system prompt. |
| Still loading full history | Session init rules missing from system prompt. Add the SESSION INITIALIZATION RULE. |
| Worker unavailable | Fallback to reasoning should be automatic. Check model.fallbacks in config. |
| Heartbeats too expensive | Verify heartbeat.model points to fuel/heartbeat. |

Built by OpenClaw.rocks. Your AI agent. Live in seconds.
